The 'call' (long call) instruction is available on avr3 and above,
and devices in avr2 and avr25 should use the 'rcall' (short call)
instruction for function calls.
Reviewed By: aykevl, dylanmckay
Differential Revision: https://reviews.llvm.org/D121539
An i8 argument should only cost 1 byte on the stack. This is
compatible with avr-gcc.
There are also more test cases (of calling convention) are added.
Reviewed By: aykevl, dylanmckay
Differential Revision: https://reviews.llvm.org/D121767
Most notably, Pass.h is no longer included by TargetMachine.h
before: 1063570306
after: 1063332844
Differential Revision: https://reviews.llvm.org/D121168
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
Apparently GCC 5.4 (a supported compiler) has a bug where it will
use the "MachineInstr &MI" defined by the range-based for loop
to evaluate the for loop expression. Pick a different variable
name to avoid this.
This patch fixes the atomicrmw result value to be the value before the
operation instead of the value after the operation. This was a bug, left
as a FIXME in the code (see https://reviews.llvm.org/D97127).
From the LangRef:
> The contents of memory at the location specified by the <pointer>
> operand are atomically read, modified, and written back. The original
> value at the location is returned.
Doing this expansion early allows the register allocator to arrange
registers in such a way that commutable operations are simply swapped
around as needed, which results in shorter code while still being
correct.
Differential Revision: https://reviews.llvm.org/D117725
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
The register R1 is defined to have the constant value 0 in the avr-gcc
calling convention (which we follow). Unfortunately, we don't really
make use of it. This patch replaces `LDI 0` instructions with a copy
from R1.
This reduces code size: my AVR build of compiler-rt goes from 50660 to
50240 bytes of code size, which is a 0.8% reduction. Presumably it will
also improve execution speed, although I didn't measure this.
Differential Revision: https://reviews.llvm.org/D117425
Background: https://github.com/avr-rust/rust-legacy-fork/issues/126
In short, this workaround was introduced to fix a "ran out of registers
during regalloc" issue. The root cause has since been fixed in
https://reviews.llvm.org/D54218 so this workaround can be removed.
There is one test that changes a little bit, removing a single
instruction. I also compiled compiler-rt before and after this patch but
didn't see a difference. So presumably the impact is very low. Still,
it's nice to be able to remove such a workaround.
Differential Revision: https://reviews.llvm.org/D117831
There is no reason to do this: it's a scratch register and can therefore
hold any arbitrary value. And because it is in an interrupt, this code
is performance critical so it should be as short as possible.
I believe r0 was cleared because of the following:
1. There used to be a bug that the cleared register was r0, not r1 as
it should have been.
2. This was fixed in https://reviews.llvm.org/D99467, but left the code
to clear r0.
This patch completes D99467 by removing the `clr r0` instruction.
Differential Revision: https://reviews.llvm.org/D116756
I have matched the RISCV backend, which only uses the interrupt save
list in getCalleeSavedRegs, _not_ in getCallPreservedMask. I don't know
the details of these two methods, but with it, the correct amount of
registers is saved and restored.
Without this patch, practically all interrupt handlers that call a
function will miscompile.
I have added a test to verify this behavior. I've also added a very
simple test to verify that more normal interrupt operations (in this
case, incrementing a global value) behave as expected.
Differential Revision: https://reviews.llvm.org/D116551
I think this pass was previously used under the assumption that most
functions would not need a frame pointer and it would be more efficient
to store the old stack pointer in a regular register pair.
Unfortunately, right now we're forced to always reserve the Y register
as a frame pointer: whether or not this is needed is only known after
regsiter allocation at which point it doesn't make sense anymore to mark
it as non-reserved. Therefore, it makes sense to use the Y register to
store the old stack pointer in functions with dynamic allocas (with a
variable size or not in the entry block). Knowing this can make the code
around dynamic allocas a lot simpler: simply save/restore the frame
pointer.
This is especially relevant in functions that have a frame pointer
anyway (for example, because they have stack spills). The stack restore
in the epilogue will implicitly restore the old stack pointer, so there
is no need to store the old stack pointer separately. It even reduces
register pressure as a side effect.
Differential Revision: https://reviews.llvm.org/D97815
Skip operation on the lower byte in int16 logical left shift when
shift amount is greater than 8.
Skip operation on the higher byte in int16 logical & arithmetic
right shift when shift amount is greater than 8.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D115594
This reverts commit fd4808887e.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be
explicitly initialized in the copy constructor [-Wextra]
No intended behavior change.
EmitGCCInlineAsmStr() used to explicitly check for modifier 'l'
after handling block address and machine basic block operands.
This prevented passing a MachineOperand with 'l' modifier to
PrintAsmMemoryOperand(). Conceptually that seems kind of nice,
but in practice the overrides of PrintAsmMemoryOperand() in all (*)
AsmPrinter subclasses already reject modifiers they don't know about,
and none of them don't know about 'l'. So removing this doesn't have
a behavior difference, is less code, and it makes EmitGCCInlineAsmStr()
and EmitMSInlineAsmStr() more similar, to prepare for merging them later.
(Why not _add_ the branch to EmitMSInlineAsmStr() instead? Because that
always works with X86AsmPrinter I think, and
X86AsmPrinter::PrintAsmMemoryOperand() very decisively rejects the 'l'
modifier, so it's hard to motivate adding that branch.)
*: The one exception was AVRAsmPrinter, which had an llvm_unreachable instead
of returning true. So this commit changes that, so that the AVR target keeps
emitting an error instead of crashing when passing a mem operand with a :l
modifier to it. All the other targets already don't crash on this.
Differential Revision: https://reviews.llvm.org/D114216
- When an unconditional branch is expanded into an indirect branch, if
there is no scavenged register, an SGPR pair needs spilling to enable
the destination PC calculation. In addition, before jumping into the
destination, that clobbered SGPR pair need restoring.
- As SGPR cannot be spilled to or restored from memory directly, the
spilling/restoring of that SGPR pair reuses the regular SGPR spilling
support but without spilling it into memory. As that spilling and
restoring points are fully controlled, we only need to spill that SGPR
into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take additional restore block,
where the restoring code is filled. After that, the relaxation will
place that restore block directly before the destination block and
insert an unconditional branch in any fall-through block into the
destination block.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D106449
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
The current inconsistency confuse contributors which coding guidlines to follow.
It would be better to have it consistent using clang-format tool.
Reviewed By: mhjacobson
Differential Revision: https://reviews.llvm.org/D109270
AttributeList::hasAttribute() is confusing, use clearer methods like
hasParamAttr()/hasRetAttr().
Add hasRetAttr() since it was missing from AttributeList.
Emit references to '__do_global_ctors' and '__do_global_dtors' to allow
constructor/destructor routines to run.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D107133
Most other registers are allocatable and therefore cannot be used.
This issue was flagged by the machine verifier, because reading other
registers is considered reading from an undefined register.
Differential Revision: https://reviews.llvm.org/D96969
This patch fixes some issues with the RORB pseudo instruction.
- A minor issue in which the instructions were said to use the SREG,
which is not true.
- An issue with the BLD instruction, which did not have an output operand.
- A major issue in which invalid instructions were generated. The fix
also reduce RORB from 4 to 3 instructions, so it's also a small
optimization.
These issues were flagged by the machine verifier.
Differential Revision: https://reviews.llvm.org/D96957