llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	69295815ed	[ARM] Update test target triple. NFC	2021-01-18 16:36:00 +00:00
David Green	1454724215	[ARM] Align blocks that are not fallthrough targets If the previous block in a function does not fall through, the nops added to align the next block will never be executed. This means we can freely (except for codesize) align more branches. This happens in the constant island pass (as it cannot happen later) and only happens at aggressive optimization levels as it does increase codesize. Differential Revision: https://reviews.llvm.org/D94394	2021-01-16 22:19:35 +00:00
David Green	32556a9832	[ARM] Remove more unused check prefixes, NFC	2020-11-14 15:37:53 +00:00
Francesco Petrogalli	fc2fe6817e	[llvm][AArch64] Simplify (and (sign_extend..) #bitmask). Fold VT = (and (sign_extend NarrowVT to VT) #bitmask) into VT = (zero_extend NarrowVT) With this combine, the test replaces a sign extended load + an unsigned extension with a zero extended load to render one of the operands of the last multiplication. BEFORE: f_i16_i32: .fnstart; ldrsh r0, [r0]; ldrsh r1, [r1]; smulbb r0, r1, r0; uxth r1, r1; mul r0, r0, r1; bx lr. AFTER: f_i16_i32: .fnstart; ldrh r1, [r1]; ldrsh r0, [r0]; smulbb r0, r0, r1; mul r0, r0, r1; bx lr. Reviewed By: resistor Differential Revision: https://reviews.llvm.org/D90605	2020-11-09 12:53:36 +00:00
Sam Parker	1c3ca61294	[ARM][ParallelDSP] Change smlad insertion order Instead of inserting everything after the 'root' of the reduction, insert all instructions as close to their operands as possible. This can help reduce register pressure. Differential Revision: https://reviews.llvm.org/D67392 llvm-svn: 374981	2019-10-16 09:37:03 +00:00
David Green	120a5e9a74	[ARM] Cortex-M4 schedule additions This is an attempt to fill in some of the missing instructions from the Cortex-M4 schedule, and make it easier to do the same for other ARM CPUs. - Some instructions are marked as hasNoSchedulingInfo as they are pseudos or otherwise do not require scheduling info - A lot of features have been marked not supported - Some WriteRes's have been added for cvt instructions. - Some extra instruction latencies have been added, notably by relaxing the regex for DSP instructions to catch more cases, and for some FP instructions. This goes a long way to get the CompleteModel working for this CPU. It does not go far enough to get all scheduling info for all output operands correct. Differential Revision: https://reviews.llvm.org/D67957 llvm-svn: 373163	2019-09-29 08:38:48 +00:00
Simon Pilgrim	9758407bf1	[TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG support. llvm-svn: 367096	2019-07-26 09:41:08 +00:00
Simon Pilgrim	cb5f7de448	[ARM][ParallelDSP] Regenerate multi-use-loads.ll test checks llvm-svn: 367094	2019-07-26 09:32:21 +00:00
David Green	d2d0f46cd2	[ARM] Cortex-M4 schedule This patch adds a simple Cortex-M4 schedule, renaming the existing M3 schedule to M4 and filling in the latencies as per the Cortex-M4 TRM: https://developer.arm.com/docs/ddi0439/latest Most of these are 1, with the important exception being loads taking 2 cycles. A few others are also higher, but I don't believe they make a large difference. I've repurposed the M3 schedule as the latencies are mostly the same between the two cores, with the M4 having more FP and DSP instructions. We also turn on MISched and UseAA for the cores that now use this. It also adds some schedule Writes to various instructions to make things simpler. Differential Revision: https://reviews.llvm.org/D54142 llvm-svn: 360768	2019-05-15 12:41:58 +00:00
Sam Parker	9e73020bfa	[ARM][ParallelDSP] Disable for big-endian Bail early when we don't have a preheader and also if the target is big endian because it's written with only little endian in mind! Differential Revision: https://reviews.llvm.org/D59368 llvm-svn: 356243	2019-03-15 10:19:32 +00:00
Sam Parker	0a833d0ad2	[NFC][ARM] Update test Change some regex to handle commutable instructions. llvm-svn: 356159	2019-03-14 15:36:54 +00:00
Sam Parker	4c4ff13d3c	[ARM][ParallelDSP] Enable multiple uses of loads When choosing whether a pair of loads can be combined into a single wide load, we check that the load only has a sext user and that sext also only has one user. But this can prevent the transformation in the cases when parallel macs use the same loaded data multiple times. To enable this, we need to fix up any other uses after creating the wide load: generating a trunc and a shift + trunc pair to recreate the narrow values. We also need to keep a record of which loads have already been widened. Differential Revision: https://reviews.llvm.org/D59215 llvm-svn: 356132	2019-03-14 11:14:13 +00:00

12 Commits