Commit Graph

171 Commits

Author SHA1 Message Date
Ayke van Laethem a560e57a7e
[AVR] Only push and clear R1 in interrupts when necessary
R1 is a reserved register, but LLVM gives the APIs to know when it is
used or not. So this patch uses these APIs to only save/clear/restore R1
in interrupts when necessary.

The main issue here was getting inline assembly to work. One could argue
that this is the job of Clang, but for consistency I've made sure that
R1 is always usable in inline assembly even if that means clearing it
when it might not be needed.

Information on inline assembly in AVR can be found here:

https://www.nongnu.org/avr-libc/user-manual/inline_asm.html#asm_code

Essentially, this seems to suggest that r1 can be freely used in avr-gcc
inline assembly, even without specifying it as an input operand.

Differential Revision: https://reviews.llvm.org/D117426
2022-08-15 14:29:38 +02:00
Ayke van Laethem 43a8dbc5be
[AVR] Use @earlyclobber instead of register scavenging
The code to support the case when the register allocator has assigned
the same register to the src and the dst register operand isn't actually
needed:

  * LDWRdPtr and LDDWRdPtrQ have an @earlyclobber on the output
    register, so the register allocator will make sure to allocate a
    different register for the output register.
  * LDDWRdYQ does not have an @earlyclobber, but the pointer register is
    the fixed Y register which is reserved. The register allocator won't
    use reserved registers for the output value.

This removes a special case in the code that makes the pseudo
instruction expansion pass more complicated than it needs to be.

Differential Revision: https://reviews.llvm.org/D131844
2022-08-15 14:29:38 +02:00
Ayke van Laethem de48717fcf
[AVR] Support unaligned store
This patch really just extends D39946 towards stores as well as loads.
While the patch is in SelectionDAGBuilder, it only applies to AVR (the
only target that supports unaligned atomic operations).

Differential Revision: https://reviews.llvm.org/D128483
2022-08-15 14:29:37 +02:00
Patryk Wychowaniec 5650688e72 [AVR] Fix expanding MOVW for overlapping registers
When expanding a MOVW (16-bit copy) to two MOVs (8-bit copy), the
lower byte always comes first. This is incorrect for corner cases like
'$r24r23 -> $r25r24', in which the higher byte copy should come first.

Current patch fixes that bug as recorded at
https://github.com/rust-lang/rust/issues/98167

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D128588
2022-06-26 17:20:07 +08:00
Ivan Kosarev ad1d60c3be [FileCheck] Catch missspelled directives.
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125604
2022-05-26 11:37:19 +01:00
Patryk Wychowaniec 6641c57aeb [AVR] Always expand STDSPQRr & STDWSPQRr
Currently, STDSPQRr and STDWSPQRr are expanded only during
AVRFrameLowering - this means that if any of those instructions happen
to appear _outside_ of the typical FrameSetup / FrameDestroy
context, they wouldn't get substituted, eventually leading to a crash:

```
LLVM ERROR: Not supported instr: <MCInst XXX <MCOperand Reg:1>
<MCOperand Imm:15> <MCOperand Reg:53>>
```

This commit fixes this issue by moving expansion of those two opcodes
into AVRExpandPseudo.

This bug was originally discovered due to the Rust compiler_builtins
library. Its 0.1.37 release contained a 128-bit software
division/remainder routine that exercised this buggy branch in the code.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D123528
2022-05-05 03:10:59 +00:00
Patryk Wychowaniec d16a631c12 [AVR] Merge AVRRelaxMemOperations into AVRExpandPseudoInsts
This commit contains a refactoring that merges AVRRelaxMemOperations
into AVRExpandPseudoInsts, so that we have a single place in code that
expands the STDWPtrQRr opcode.

Seizing the day, I've also fixed a couple of potential bugs with our
previous implementation (e.g. when the destination register was killed,
the previous implementation would try to .addDef() that killed
register, crashing LLVM in the process - that's fixed now, as proved by
the test).

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D122533
2022-04-11 02:42:13 +00:00
Ben Shi bce2e208e0 [AVR] Optimize int16 airthmetic right shift for shift amount 7/14/15
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115618
2022-03-26 06:53:27 +00:00
Ben Shi 49b0b5f0fa [AVR][NFC] Fix incorrect register states in expanding pseudo instructions
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D118354
2022-03-25 16:02:15 +00:00
Ben Shi f319c24570 [AVR] Reject/Reserve R0~R15 on AVRTiny.
Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D121672
2022-03-24 02:33:51 +00:00
Ben Shi d7afea9eb8 [AVR][MC] Emit some aliases for GPRs and IO registers
Emit the following aliases (if available):

.set __tmp_reg__, [0|16]
.set __zero_reg__, [1|17]
.set __SREG__, 63
.set __SP_H__, 62
.set __SP_L__, 61
.set __EIND__, 60
.set __RAMPZ__, 59

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D119807
2022-03-24 02:08:22 +00:00
Ben Shi 45638931fb [AVR] Generate 'rcall' instead of 'call' on avr2 and avr25
The 'call' (long call) instruction is available on avr3 and above,
and devices in avr2 and avr25 should use the 'rcall' (short call)
instruction for function calls.

Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D121539
2022-03-23 02:00:15 +00:00
Ben Shi 3fd9a320da [AVR] Fix incorrect calling convention for varargs functions
An i8 argument should only cost 1 byte on the stack. This is
compatible with avr-gcc.

There are also more test cases (of calling convention) are added.

Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D121767
2022-03-23 02:00:15 +00:00
Ben Shi fa2d31e9e6 [AVR] Fix a potential assert failure
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D119416
2022-02-11 02:25:58 +00:00
Ayke van Laethem 44ee9864a4
[AVR][NFC] Make atomics tests easier to read
Use the same mnemonics in the tests that are used in the AtomicLoadOp
pattern ($rd, $rr) but use RR1 instead of $operand. This matches similar
tests in load8.ll.

Differential Revision: https://reviews.llvm.org/D117991
2022-02-02 09:10:39 +01:00
Ayke van Laethem 316664783d
[AVR] Fix atomicrmw result value
This patch fixes the atomicrmw result value to be the value before the
operation instead of the value after the operation. This was a bug, left
as a FIXME in the code (see https://reviews.llvm.org/D97127).

From the LangRef:

> The contents of memory at the location specified by the <pointer>
> operand are atomically read, modified, and written back. The original
> value at the location is returned.

Doing this expansion early allows the register allocator to arrange
registers in such a way that commutable operations are simply swapped
around as needed, which results in shorter code while still being
correct.

Differential Revision: https://reviews.llvm.org/D117725
2022-02-02 09:10:39 +01:00
Ayke van Laethem 116ab78694
[AVR] Make use of the constant value 0 in R1
The register R1 is defined to have the constant value 0 in the avr-gcc
calling convention (which we follow). Unfortunately, we don't really
make use of it. This patch replaces `LDI 0` instructions with a copy
from R1.

This reduces code size: my AVR build of compiler-rt goes from 50660 to
50240 bytes of code size, which is a 0.8% reduction. Presumably it will
also improve execution speed, although I didn't measure this.

Differential Revision: https://reviews.llvm.org/D117425
2022-01-23 17:08:01 +01:00
Ayke van Laethem 153359180a
[AVR] Remove regalloc workaround for LDDWRdPtrQ
Background: https://github.com/avr-rust/rust-legacy-fork/issues/126

In short, this workaround was introduced to fix a "ran out of registers
during regalloc" issue. The root cause has since been fixed in
https://reviews.llvm.org/D54218 so this workaround can be removed.

There is one test that changes a little bit, removing a single
instruction. I also compiled compiler-rt before and after this patch but
didn't see a difference. So presumably the impact is very low. Still,
it's nice to be able to remove such a workaround.

Differential Revision: https://reviews.llvm.org/D117831
2022-01-23 17:08:00 +01:00
Ben Shi 94173dc24c [AVR] Generate ELPM for loading byte/word from extended program memory
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D116493
2022-01-20 02:53:10 +00:00
Ben Shi c1dd607463 [AVR][MC] Generate section '.progmemX.data' for extended flash banks
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115987
2022-01-20 02:53:10 +00:00
Ayke van Laethem ca27b026f9
[AVR] Do not clear r0 at interrupt entry
There is no reason to do this: it's a scratch register and can therefore
hold any arbitrary value. And because it is in an interrupt, this code
is performance critical so it should be as short as possible.

I believe r0 was cleared because of the following:

 1. There used to be a bug that the cleared register was r0, not r1 as
    it should have been.
 2. This was fixed in https://reviews.llvm.org/D99467, but left the code
    to clear r0.

This patch completes D99467 by removing the `clr r0` instruction.

Differential Revision: https://reviews.llvm.org/D116756
2022-01-19 14:22:13 +01:00
Ayke van Laethem 3d59d94a20
[AVR] Mark call-clobbered registers as clobbered in interrupt handlers
I have matched the RISCV backend, which only uses the interrupt save
list in getCalleeSavedRegs, _not_ in getCallPreservedMask. I don't know
the details of these two methods, but with it, the correct amount of
registers is saved and restored.

Without this patch, practically all interrupt handlers that call a
function will miscompile.

I have added a test to verify this behavior. I've also added a very
simple test to verify that more normal interrupt operations (in this
case, incrementing a global value) behave as expected.

Differential Revision: https://reviews.llvm.org/D116551
2022-01-19 14:22:13 +01:00
Ayke van Laethem f41d2d9469
[AVR] Remove redundant dynalloca SP save/restore pass
I think this pass was previously used under the assumption that most
functions would not need a frame pointer and it would be more efficient
to store the old stack pointer in a regular register pair.

Unfortunately, right now we're forced to always reserve the Y register
as a frame pointer: whether or not this is needed is only known after
regsiter allocation at which point it doesn't make sense anymore to mark
it as non-reserved. Therefore, it makes sense to use the Y register to
store the old stack pointer in functions with dynamic allocas (with a
variable size or not in the entry block). Knowing this can make the code
around dynamic allocas a lot simpler: simply save/restore the frame
pointer.

This is especially relevant in functions that have a frame pointer
anyway (for example, because they have stack spills). The stack restore
in the epilogue will implicitly restore the old stack pointer, so there
is no need to store the old stack pointer separately. It even reduces
register pressure as a side effect.

Differential Revision: https://reviews.llvm.org/D97815
2022-01-19 14:22:13 +01:00
Nikita Popov f430c1eb64 [Tests] Add elementtype attribute to indirect inline asm operands (NFC)
This updates LLVM tests for D116531 by adding elementtype attributes
to operands that correspond to indirect asm constraints.
2022-01-06 14:23:51 +01:00
Ben Shi 99e7bf46c9 [AVR] Optimize int16 shift operation for shift amount greater than 8
Skip operation on the lower byte in int16 logical left shift when
shift amount is greater than 8.

Skip operation on the higher byte in int16 logical & arithmetic
right shift when shift amount is greater than 8.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115594
2022-01-04 11:48:50 +00:00
Ben Shi f4ef79306c [AVR] Optimize int8 arithmetic right shift 6 bits
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115593
2022-01-04 10:36:03 +00:00
Ben Shi 9fb4e79d06 Revert "[AVR] Optimize int8 arithmetic right shift 6 bits"
This reverts commit 5723261370.

There are failures as reported in

https://lab.llvm.org/buildbot#builders/16/builds/21638
https://lab.llvm.org/buildbot#builders/104/builds/5394
2022-01-04 04:14:15 +00:00
Ben Shi 5723261370 [AVR] Optimize int8 arithmetic right shift 6 bits
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115593
2022-01-04 03:20:29 +00:00
Nico Weber 4f9a5c2a14 [asm] Remove explicit branch for modifier 'l'
No intended behavior change.

EmitGCCInlineAsmStr() used to explicitly check for modifier 'l'
after handling block address and machine basic block operands.
This prevented passing a MachineOperand with 'l' modifier to
PrintAsmMemoryOperand(). Conceptually that seems kind of nice,
but in practice the overrides of PrintAsmMemoryOperand() in all (*)
AsmPrinter subclasses already reject modifiers they don't know about,
and none of them don't know about 'l'. So removing this doesn't have
a behavior difference, is less code, and it makes EmitGCCInlineAsmStr()
and EmitMSInlineAsmStr() more similar, to prepare for merging them later.

(Why not _add_ the branch to EmitMSInlineAsmStr() instead? Because that
always works with X86AsmPrinter I think, and
X86AsmPrinter::PrintAsmMemoryOperand() very decisively rejects the 'l'
modifier, so it's hard to motivate adding that branch.)

*: The one exception was AVRAsmPrinter, which had an llvm_unreachable instead
of returning true. So this commit changes that, so that the AVR target keeps
emitting an error instead of crashing when passing a mem operand with a :l
modifier to it. All the other targets already don't crash on this.

Differential Revision: https://reviews.llvm.org/D114216
2021-11-19 09:19:53 -05:00
Guozhi Wei 6599961c17 [TwoAddressInstructionPass] Improve the SrcRegMap and DstRegMap computation
This patch contains following enhancements to SrcRegMap and DstRegMap:

  1 In findOnlyInterestingUse not only check if the Reg is two address usage,
    but also check after commutation can it be two address usage.

  2 If a physical register is clobbered, remove SrcRegMap entries that are
    mapped to it.

  3 In processTiedPairs, when create a new COPY instruction, add a SrcRegMap
    entry only when the COPY instruction is coalescable. (The COPY src is
    killed)

With these enhancements isProfitableToCommute can do better commute decision,
and finally more register copies are removed.

Differential Revision: https://reviews.llvm.org/D108731
2021-10-11 15:28:31 -07:00
Matt Jacobson 75abeb64ce [AVR] emit 'MCSA_Global' references to '__do_global_ctors' and '__do_global_dtors'
Emit references to '__do_global_ctors' and '__do_global_dtors' to allow
constructor/destructor routines to run.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D107133
2021-08-05 10:37:36 +08:00
Ayke van Laethem 4d7f5c0a85
[AVR] Only support sp, r0 and r1 in llvm.read_register
Most other registers are allocatable and therefore cannot be used.

This issue was flagged by the machine verifier, because reading other
registers is considered reading from an undefined register.

Differential Revision: https://reviews.llvm.org/D96969
2021-07-24 14:03:27 +02:00
Ayke van Laethem 41f905b211
[AVR] Fix rotate instructions
This patch fixes some issues with the RORB pseudo instruction.

  - A minor issue in which the instructions were said to use the SREG,
    which is not true.
  - An issue with the BLD instruction, which did not have an output operand.
  - A major issue in which invalid instructions were generated. The fix
    also reduce RORB from 4 to 3 instructions, so it's also a small
    optimization.

These issues were flagged by the machine verifier.

Differential Revision: https://reviews.llvm.org/D96957
2021-07-24 14:03:26 +02:00
Ayke van Laethem 6aa9e746eb
[AVR] Expand large shifts early in IR
This patch makes sure shift instructions such as this one:

    %result = shl i32 %n, %amount

are expanded just before the IR to SelectionDAG conversion to a loop so
that calls to non-existing library functions such as __ashlsi3 are
avoided. The generated code is currently pretty bad but there's a lot of
room for improvement: the shift itself can be done in just four
instructions.

Differential Revision: https://reviews.llvm.org/D96677
2021-07-24 14:03:26 +02:00
Ayke van Laethem feda08b70a
[AVR] Do not chain stores in call frame setup
Previously, AVRTargetLowering::LowerCall attempted to keep stack stores
in order with chains. Perhaps this worked in the past, but it does not
work now: it appears that the SelectionDAG legalization phase removes
these chains. Therefore, I've removed these chains entirely to match
X86 (which, similar to AVR, also prefers to use push instructions over
stack-relative stores to set up a call frame). With this change, all the
stack stores are in a somewhat reasonable order.

Differential Revision: https://reviews.llvm.org/D97853
2021-07-24 14:03:26 +02:00
Alex Richardson c142c06c19 Place the BlockAddress type in the address space of the containing function
While this should not matter for most architectures (where the program
address space is 0), it is important for CHERI (and therefore Arm Morello).
We use address space 200 for all of our code pointers and without this
change we assert in the SelectionDAG handling of BlockAddress nodes.

It is also useful for AVR: previously programs targeting
AVR that attempt to read their own machine code
via a pointer to a label would instead read from RAM
using a pointer relative to the the start of program flash.

Reviewed By: dylanmckay, theraven
Differential Revision: https://reviews.llvm.org/D48803
2021-07-02 12:17:55 +01:00
Ben Shi c85175c5f6 [AVR] Fix a bug in prologue of ISR
The r1 register should be cleared in prologue of ISR as it is used
as constant zero.

Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D99467
2021-06-29 21:44:50 +08:00
Ben Shi 1dd2d15b50 [AVR][test] Add a new test: functions with struct return type
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D99239
2021-06-28 21:19:26 +08:00
Ben Shi 86812faa5f [AVR] Improve inline assembly
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D96394
2021-05-30 23:44:43 +08:00
Ayke van Laethem a1155ae64d
[AVR] Fix lifeness issues in the AVR backend
This patch is a large number of small changes that should hopefully not
affect the generated machine code but are still important to get right
so that the machine verifier won't complain about them.

The llvm/test/CodeGen/AVR/pseudo/*.mir changes are also necessary
because without the liveins the used registers are considered undefined
by the machine verifier and it will complain about them.

Differential Revision: https://reviews.llvm.org/D97172
2021-03-04 14:04:39 +01:00
Ayke van Laethem 15f495c0bc
[AVR] Fix def state of operands
Some instructions (especially mov+pop instructions) were setting the
wrong operands. For example, the pop instruction had the register set as
a source operand while it is a destination operand (the value is loaded
into the register).

I have found these issues using the machine verifier and using manual
code inspection.

Differential Revision: https://reviews.llvm.org/D97159
2021-03-03 15:36:05 +01:00
Ayke van Laethem bbfef8ac95
[AVR] Fix expansion of NEGW
The previous expansion used SBCI, which is incorrect because the NEGW
pseudo instruction accepts a DREGS operand (2xGPR8) and SBCI only allows
LD8 registers. One solution could be to correct the NEGW pseudo
instruction, but another solution is to use a different instruction
(sbc) that does accept a GPR8 register and therefore allows more freedom
to the register allocator.

The output now matches avr-gcc for the following code:

    int foo(int n) {
        return -n;
    }

I've found this issue using the machine instruction verifier: it was
complaining about the wrong register class in NEGWRd.mir.

Differential Revision: https://reviews.llvm.org/D97131
2021-03-03 15:36:05 +01:00
Ben Shi efb1cb752b [AVR] Fix a bug in 16-bit shifts
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D96590
2021-02-14 11:54:55 +08:00
Dylan McKay 2ccb941740 [AVR] Fix global references to function symbols
References to functions are in program memory and need a `pm()` fixup. This should fix trait objects for Rust on AVR.

Differential Revision: https://reviews.llvm.org/D87631

Patch by Alex Mikhalev.
2021-02-10 00:40:49 +13:00
Ben Shi 50f1aa1db5 [AVR] Optimize 16-bit int shift
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D90092
2021-01-28 15:10:11 +08:00
Ben Shi 2d7aa149a4 [update_llc_test_checks] Support AVR
Reviewed By: arichardson

Differential Revision: https://reviews.llvm.org/D95240
2021-01-26 17:50:56 +08:00
Ben Shi 2a4acf3ea8 [AVR] Optimize 8-bit int shift
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D90678
2021-01-24 11:04:37 +08:00
Ben Shi 1eb8c5cd35 [AVR] Optimize 16-bit comparison with constant
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D93976
2021-01-24 00:38:57 +08:00
Ben Shi 25531a1d96 [AVR] Optimize 8-bit logic left/right shifts
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D89047
2021-01-23 23:54:16 +08:00
Ben Shi 9f8f8db339 [AVR] Optimize the 16-bit NEGW pseudo instruction
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D88658
2020-11-17 17:51:58 +08:00