The PPC backend doesn't handle these correctly. This patch uses logic
similar to that in the X86 and ARM backends to track these arguments
properly.
llvm-svn: 175635
Add HexagonMCInst class which adds various Hexagon VLIW annotations.
In addition, this class also includes some APIs related to the
constant extenders.
llvm-svn: 175634
During lowering of a BUILD_VECTOR, we look for opportunities to use a
vector splat. When the splatted value fits in 5 signed bits, a single
splat does the job. When it doesn't fit in 5 bits but does fit in 6,
and is an even value, we can splat on half the value and add the result
to itself.
This last optimization hasn't been working recently because of improved
constant folding. To circumvent this, create a pseudo VADD_SPLAT that
can be expanded during instruction selection.
llvm-svn: 175632
sext <4 x i1> to <4 x i64>
sext <4 x i8> to <4 x i64>
sext <4 x i16> to <4 x i64>
I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
(sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution.
I also added a cost of this operations to the AVX costs table.
llvm-svn: 175619
It is possible that frame pointer is not found in the
callee saved info, thus FramePtrSpillFI may be incorrect
if we don't check the result of hasFP(MF).
Besides, if we enable the stack coloring algorithm, there
will be an assertion to ensure the slot is live. But in
the test case, %var1 is not live in the prologue of the
function, and we will get the assertion failure.
Note: There is similar code in ARMFrameLowering.cpp.
llvm-svn: 175616
SltCCRxRy16, SltiCCRxImmX16, SltiuCCRxImmX16, SltuCCRxRy16
$T8 shows up as register $24 when emitted from C++ code so we had
to change some tests that were already there for this functionality.
llvm-svn: 175593
MS-style inline assembly.
This is a follow-on to r175334. Forcing a FP to be emitted doesn't ensure it
will be used. Therefore, force the base pointer as well. We now treat MS
inline assembly in the same way we treat functions with dynamic stack
realignment and VLAs. This guarantees the BP will be used to reference
parameters and locals.
rdar://13218191
llvm-svn: 175576
excluding visibility bits.
Mips (o32 abi) specific e_header setting.
EF_MIPS_ABI_O32 needs to be set in the
ELF header flags for o32 abi output.
Contributer: Reed Kotler
llvm-svn: 175569
excluding visibility bits.
Mips (Mips16) specific e_header setting.
EF_MIPS_ARCH_ASE_M16 needs to be set in the
ELF header flags for Mips16.
Contributer: Reed Kotler
llvm-svn: 175566
excluding visibility bits.
Mips (MicroMips) specific STO handling .
The st_other field settig for STO_MIPS_MICROMIPS
Contributer: Zoran Jovanovic
llvm-svn: 175564
In my previous commit:
"Merge a f32 bitcast of a v2i32 extractelt
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers."
I added a pattern containing a copy_to_regclass. The copy_to_regclass is
actually not needed.
radar://13191881
llvm-svn: 175555
When creating an allocation hint for a register pair, make sure the hint
for the physical register reference is still in the allocation order.
rdar://13240556
llvm-svn: 175541
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers.
The patch fixes this by using a COPY_TO_REGCLASS and a EXTRACT_SUBREG to extract
the element from the vector instead.
radar://13191881
llvm-svn: 175520
This stops the Machine Verifier from complaining about uses of undefined
physical registers.
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175518
Kernel function arguments are lowered to loads from the PARAM_I address
space. When creating these load instructions, we were initializing
their MachinePointerInfo with an Arguement object that was not attached
to any function. This was causing the MachineScheduler to crash when
it tried to access the parent of the Arguement.
This has been fixed by initializing the MachinePointerInfo with a
UndefValue instead.
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175517
In some cases, we were losing track of live implicit registers which
was creating dead defs and causing the scheduler to produce invalid
code.
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175516
fields were only ever set in the constructor. The create method retains
its consistent interface so that these bits can be re-threaded through
the emitter if they're ever needed.
This was found by the -Wunused-private-field Clang warning.
llvm-svn: 175482
at this time, llvm is generating a different but equivalent pattern
that would lead to this instruction. I am trying to think of a way
to get it to generate this. If I can't, I may just remove the pseudo.
llvm-svn: 175419
This expansion will be moved to expandISelPseudos as soon as I can figure
out how to do that. There are other instructions which use this
ExpandFEXT_T8I816_ins and as soon as I have finished expanding them all,
I will delete the macro asm string text so it has no way to be used
in the future.
llvm-svn: 175413
GCC warns about the attribute being ignored if it occurs after void*.
There seems to be some kind of incompatibility between clang and gcc here, but
I can't fathom who's right.
void* LLVM_LIBRARY_VISIBILITY foo(); // clang: hidden, gcc: default
LLVM_LIBRARY_VISIBILITY void *bar(); // clang: hidden, gcc: hidden
void LLVM_LIBRARY_VISIBILITY qux(); // clang: hidden, gcc: hidden
llvm-svn: 175394
as well as 16/32 bit variants to do and so I want this to look nice
when I do it. I've been experimenting with this. No new test cases
are needed.
llvm-svn: 175369
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175356
It's completely unnecessary and can be replace with proper
SReg_64 handling instead.
This actually fixes a piglit test on SI.
v2: use correct register class in addRegisterClass,
set special classes as not allocatable
v3: revert setting special classes as not allocateable
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175355
Seems to be allot simpler, and also paves the
way for further improvements.
v2: rebased on master, use 0 in BUFFER_LOAD_FORMAT_XYZW,
use VGPR0 in dummy EXP, avoid compiler warning, break
after encoding the first literal.
v3: correctly use V_ADD_F32_e64
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175354
Mark all the operands that can also have an immediate.
v2: SOFFSET is also an SSrc_32 operand
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175353
Previously it only worked because of coincident.
v2: fix 64bit versions, use 0x80 (inline 0) instead of SGPR0
for the unused SRC2
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175352
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175351
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175350
Stop adding more instructions than necessary.
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175349
Generate more than one loop if it seems to make sense.
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175348
Using the new NearestCommonDominator class.
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175347
Using the new NearestCommonDominator class.
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175346
This is a candidate for the stable branch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175345
If the frame pointer is omitted, and any stack changes occur in the inline
assembly, e.g.: "pusha", then any C local variable or C argument references
will be incorrect.
I pass no judgement on anyone who would do such a thing. ;)
rdar://13218191
llvm-svn: 175334
If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-feature' attribute.
llvm-svn: 175314
functions. Set AddedComplexity to determine the order in which patterns are
matched.
This simplifies selection of floating point loads/stores.
No functionality change intended.
llvm-svn: 175300
of the old jit and which we don't intend to support in mips16 or micromips.
This dependency is for the testing of whether an instruction is a pseudo.
llvm-svn: 175297
This is essentially a stripped-down version of the ConstandIslands pass (which
always had these two functions), providing just the features necessary for
correctness.
In particular there needs to be a way to resolve the situation where a
conditional branch's destination block ends up out of range.
This issue crops up when self-hosting for AArch64.
llvm-svn: 175269
blocks. We still don't have consensus if we should try to change clang or
the standard, but llvm should work with compilers that implement the current
standard and mangle those functions.
llvm-svn: 175267
This implements the review suggestion to simplify the AArch64 backend. If we
later discover that we *really* need the extra complexity of the
ConstantIslands pass for performance reasons it can be resurrected.
llvm-svn: 175258
In the near future litpools will be in a different section, which means that
any access to them is at least two instructions. This makes the case for a
movz/movk pair (if total offset <= 32-bits) even more compelling.
llvm-svn: 175257
not matter but makes it more gcc compatible which avoids possible subtle
problems. Also, turned back on a disabled check in helloworld.ll.
llvm-svn: 175237
assembler should also accept a two arg form, as the docuemntation specifies that
the first (destination) register is optional.
This patch uses TwoOperandAliasConstraint to add the two argument form.
It also fixes an 80-column formatting problem in:
test/MC/ARM/neon-bitwise-encoding
<rdar://problem/12909419> Clang rejects ARM NEON assembly instructions
llvm-svn: 175221
1. Define and use function terminateSearch.
2. Use MachineBasicBlock::iterator instead of MachineBasicBlock::instr_iterator.
3. Delete the line which checks whether an instruction is a pseudo.
llvm-svn: 175219
This patch doesn't introduce any functionality changes.
It adds some new fields to the Hexagon instruction classes and
changes their layout to support instruction encoding.
llvm-svn: 175205