llvm-project

Author	SHA1	Message	Date
Ivan A. Kosarev	60a991ed1a	[NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47120 llvm-svn: 333825	2018-06-02 16:40:03 +00:00
Ivan A. Kosarev	73c5337a64	Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)" The LLVM part was committed instead of the Clang part. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333824	2018-06-02 16:38:38 +00:00
Ivan A. Kosarev	51f19b9ee1	[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333819	2018-06-02 16:26:42 +00:00
Roman Tereshin	667c7581ed	[GlobalISel][ARM] LegalizerInfo verifier: Adding LegalizerInfo::verify(...) call and fixing bugs exposed Reviewers: aemerson, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D46339 llvm-svn: 333663	2018-05-31 16:16:48 +00:00
Amaury Sechet	f47d9f30b0	[ARM] Remove code handling ADDC/ADDE/SUBC/SUBE Summary: This code is now dead as the ARM backend uses ADDCARRY/SUBCARRY/SETCCCARRY . Reviewers: rogfer01, efriedma, rengolin, javed.absar Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D47413 llvm-svn: 333544	2018-05-30 13:45:43 +00:00
Eli Friedman	63fead0f43	[ARM] Enable SETCCCARRY lowering for Thumb1. We've had Thumb1 support for ARMISD::SUBE for a while now, so this just works. Reduces codesize a bit for 64-bit integer comparisons. Differential Revision: https://reviews.llvm.org/D47387 llvm-svn: 333445	2018-05-29 18:17:16 +00:00
David Green	aee7ad0cde	Revert 333358 as it's failing on some builders. I'm guessing the tests rely on the ARM backend being built. llvm-svn: 333359	2018-05-27 12:54:33 +00:00
David Green	3034281b43	[UnrollAndJam] Add a new Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 333358	2018-05-27 12:11:21 +00:00
Petar Jovanovic	c051000b83	[X86][MIPS][ARM] New machine instruction property 'isMoveReg' This property is needed in order to follow values movement between registers. This property is used in TII to implement method that returns true if simple copy like instruction is recognized, along with source and destination machine operands. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D45204 llvm-svn: 333093	2018-05-23 15:28:28 +00:00
Roman Tereshin	e79d656c33	[GlobalISel][ARM] Adding HPR and QPR regclasses to FPRB regbank Also bringing ARMRegisterBankInfo::getRegBankFromRegClass implementation up to speed with the *.td-definition. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D43982 llvm-svn: 333056	2018-05-23 02:59:31 +00:00
Peter Collingbourne	dcd7d6c331	MC: Separate creating a generic object writer from creating a target object writer. NFCI. With this we gain a little flexibility in how the generic object writer is created. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47045 llvm-svn: 332868	2018-05-21 19:20:29 +00:00
Peter Collingbourne	571a3301ae	MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. To make this work I needed to add an endianness field to MCAsmBackend so that writeNopData() implementations know which endianness to use. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47035 llvm-svn: 332857	2018-05-21 17:57:19 +00:00
Tim Northover	4e3eec39fa	ARM: be conservative when asked load/store alignment of weird type. Chances are we'll be asked again after type legalization, but before that point it's better to claim misaligned accesses aren't allowed than to assert. llvm-svn: 332840	2018-05-21 12:43:54 +00:00
Peter Collingbourne	f7b81db715	MC: Change the streamer ctors to take an object writer instead of a stream. NFCI. The idea is that a client that wants split dwarf would create a specific kind of object writer that creates two files, and use it to create the streamer. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47050 llvm-svn: 332749	2018-05-18 18:26:45 +00:00
Amara Emerson	0d6a26dffc	[GlobalISel][IRTranslator] Split aggregates during IR translation. We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However using this approach means that we can't support big endian code correctly. This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value. A new Value to VReg mapper is introduced to help keep compile time under control, currently there is no measurable impact on CTMark despite the extra code being generated in some cases. Patch is based on the original work of Tim Northover. Differential Revision: https://reviews.llvm.org/D46018 llvm-svn: 332449	2018-05-16 10:32:02 +00:00
Peter Collingbourne	ec8236ead1	ARM: Remove unnecessary argument. NFCI. IsLittleEndian is already a field of ARMAsmBackend. llvm-svn: 332420	2018-05-16 00:21:47 +00:00
Peter Collingbourne	76d463af0a	ARM: Deduplicate code and remove unnecessary declaration. NFCI. llvm-svn: 332419	2018-05-16 00:21:31 +00:00
Martin Storsjo	ace7ae935f	[ARM] Back up R4 and LR if calling the stack probe function Differential Revision: https://reviews.llvm.org/D46777 llvm-svn: 332298	2018-05-14 21:32:52 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Amaury Sechet	4f729f6a67	[ARM] Add support for SETCCCARRY instead of SETCCE Summary: As per title. SETCCE is deprecated and will eventually be removed. Reviewers: rogfer01, efriedma, rengolin, javed.absar Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D46512 llvm-svn: 331929	2018-05-09 22:15:51 +00:00
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Amaury Sechet	f91b6a8cf7	[ARM] Select result 1 from ConvertBooleanCarryToCarryFlag's result automatically. NFC The old behavior return the value 0, which is error prone. llvm-svn: 331614	2018-05-07 01:43:42 +00:00
Craig Topper	781aa181ab	Fix a bunch of places where operator-> was used directly on the return from dyn_cast. Inspired by r331508, I did a grep and found these. Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa. llvm-svn: 331577	2018-05-05 01:57:00 +00:00
Tim Northover	28e0a6f7dd	ARM: don't try to over-align large vectors as arguments. By default LLVM thinks very large vectors get aligned to their size when passed across functions. Unfortunately no-one told the ARM backend so it doesn't trigger stack realignment and so accesses can cause the usual misalignment issues (e.g. a data abort). This changes the ABI alignment to the stack alignment, which in practice (and as a bonus) also coincides with the alignment "natural" vectors get. llvm-svn: 331451	2018-05-03 12:54:25 +00:00
Clement Courbet	6794660828	[TableGen][NFC] Make ResourceCycles definitions more explicit. https://reviews.llvm.org/D46356 llvm-svn: 331439	2018-05-03 06:08:47 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Oliver Stannard	f3632143da	[ARM] Codegen for v8.2A dot product intrinsics This adds IR intrinsics for the ARM dot-product instructions introduced in v8.2-A. Differential revision: https://reviews.llvm.org/D46106 llvm-svn: 331032	2018-04-27 12:50:40 +00:00
David Green	c4cccea4c9	[ARM] Enable misched for R52. Back when the R52 schedule was added in rL286949, there was no way to enable machine schedules in ARM for specific cores. Since then a target feature has been added. This enables the feature for R52, removing the need to manually specify compiler flags. llvm-svn: 331027	2018-04-27 11:29:49 +00:00
Nico Weber	77c5471d9f	List cpp file only once (was added in 147117 and 147117 as build fix each). llvm-svn: 330587	2018-04-23 13:11:51 +00:00
Nico Weber	5d53aed419	Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt llvm-svn: 330584	2018-04-23 12:49:34 +00:00
Tim Northover	271d3d2771	MachO: trap unreachable instructions Debuggability is more important than saving 4 bytes to let us fall through to nonsense. llvm-svn: 330073	2018-04-13 22:25:20 +00:00
Sjoerd Meijer	834f7dc7ab	[ARM] FP16 vmaxnm/vminnm scalar instructions This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions. Differential Revision: https://reviews.llvm.org/D44675 llvm-svn: 330034	2018-04-13 15:34:26 +00:00
Ivan A. Kosarev	f533a6e5aa	[NEON] Support intrinsic for scalar and vector versions of the VRINTN instruction Differential Revision: https://reviews.llvm.org/D45514 llvm-svn: 330011	2018-04-13 12:45:12 +00:00
Sjoerd Meijer	ac96d7c4b3	[ARM] FP16 VSEL codegen This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788	2018-04-11 09:28:04 +00:00
Hiroshi Inoue	9ff2380ea6	[NFC] fix trivial typos in comments and error message "is is" -> "is", "are are" -> "are" llvm-svn: 329546	2018-04-09 04:37:53 +00:00
Tim Northover	e25e458d52	Reapply ARM: Do not spill CSR to stack on entry to noreturn functions Should fix UBSan bot by also checking there's no "uwtable" attribute before skipping. Otherwise the unwind table will be useless since its moves expect CSRs to actually be preserved. A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch mostly by myeisha (pmb). llvm-svn: 329494	2018-04-07 10:57:03 +00:00
Vitaly Buka	de5f196530	Revert "ARM: Do not spill CSR to stack on entry to noreturn functions" Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM This reverts commit r329287 llvm-svn: 329486	2018-04-07 05:36:44 +00:00
Mandeep Singh Grang	9893fe218c	[ARM] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, RKSimon, MatzeB, bkramer Reviewed By: bkramer Subscribers: javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44855 llvm-svn: 329329	2018-04-05 18:31:50 +00:00
Tim Northover	b30388bf11	ARM: Do not spill CSR to stack on entry to noreturn functions A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch by myeisha (pmb). llvm-svn: 329287	2018-04-05 14:26:06 +00:00
Nico Weber	1cbd096914	Sort targetgen calls in lib/Target/*/CMakeLists. Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181	2018-04-04 12:37:44 +00:00
Mikhail Maltsev	68f35bcc85	[ARM] Do not convert some vmov instructions Summary: Patch https://reviews.llvm.org/D44467 implements conversion of invalid vmov instructions into valid ones. It turned out that some valid instructions also get converted, for example vmov.i64 d2, #0xff00ff00ff00ff00 -> vmov.i16 d2, #0xff00 Such behavior is incorrect because according to the ARM ARM section F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD instructions, "On assembly, the data type must be matched in the table if possible." This patch fixes the isNEONmovReplicate check so that the above instruction is not modified any more. Reviewers: rengolin, olista01 Reviewed By: rengolin Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D44678 llvm-svn: 329158	2018-04-04 08:54:19 +00:00
David Blaikie	bd0c88078a	Remove some unneeded #includes to fix layering llvm-svn: 328838	2018-03-29 22:31:36 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
Christof Douma	a1e77c0e02	[ARM] Support float literals under XO Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691	2018-03-28 10:02:26 +00:00
Martin Storsjo	439824622a	[ARM] Simplify constructing the ARMArchFeature string. NFC. Differential Revision: https://reviews.llvm.org/D44819 llvm-svn: 328478	2018-03-26 08:41:10 +00:00
Simon Pilgrim	351e4fa0e2	[ARM] Remove sched model instregex entries that don't match any instructions (D44687) Reviewed by @javed.absar llvm-svn: 328457	2018-03-25 19:07:17 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	13e77db2df	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
David Blaikie	6054e650ff	Move TargetLoweringObjectFile from CodeGen to Target to fix layering It's implemented in Target & include from other Target headers, so the header should be in Target. llvm-svn: 328392	2018-03-23 23:58:19 +00:00
Ana Pazos	41573804f2	[ARM] Fix "Constant pool entry out of range!" in Thumb1 mode This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode. In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode, adjustBBOffsetsAfter() is not calculating postOffset correctly by properly accounting for the padding that is required for the constant pool that immediately follows the jump table branch instruction. Reviewers: t.p.northover, eli.friedman Reviewed By: t.p.northover Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44709 llvm-svn: 328341	2018-03-23 17:53:27 +00:00
Christof Douma	4a025cc79d	[ARM] Support float literals under XO When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into a integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313	2018-03-23 13:02:03 +00:00
Martin Storsjo	e1a64fe95c	[ARM] Error out on .arm assembler directives on windows Windows on arm is thumb only. Differential Revision: https://reviews.llvm.org/D43005 llvm-svn: 328298	2018-03-23 09:10:03 +00:00
Craig Topper	7ccb5ebed8	[ARM] Enable the full InstRW overlap check for ARMScheduleR52.td This fixes a few issues with the R52 instregexs to enable the full overlap checking Differential Revision: https://reviews.llvm.org/D44767 llvm-svn: 328216	2018-03-22 17:17:47 +00:00
Nirav Dave	3264c1bdf6	[DAG, X86] Revert r327197 "Revert r327170, r327171, r327172" Reland ISel cycle checking improvements after simplifying node id invariant traversal and correcting typo. llvm-svn: 327898	2018-03-19 20:19:46 +00:00
Martin Storsjo	9a55c1b0dc	[ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes This extends the use of this attribute on ARM and AArch64 from SVN r325900 (where it was only checked for fixed stack allocations on ARM/AArch64, but for all stack allocations on X86). This also adds a testcase for the existing use of disabling the fixed stack probe with the attribute on ARM and AArch64. Differential Revision: https://reviews.llvm.org/D44291 llvm-svn: 327897	2018-03-19 20:06:50 +00:00
Sjoerd Meijer	d16037d9bb	[ARM] Support for v4f16 and v8f16 vectors This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which uses v4f16 and v8f16 vector operands and return values. All the moving parts are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16 intrinsic. In a follow-up patch the rest of the intrinsics and tests will be added. Differential Revision: https://reviews.llvm.org/D44538 llvm-svn: 327839	2018-03-19 13:35:25 +00:00
Mikhail Maltsev	f07278ec31	[ARM] Fix warnings about missing parentheses in ARMAsmParser llvm-svn: 327827	2018-03-19 09:48:58 +00:00
Craig Topper	e1d6a4df1c	[TableGen] When trying to reuse a scheduler class for instructions from an InstRW, make sure we haven't already seen another InstRW containing this instruction on this CPU. This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check. So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag. A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found. llvm-svn: 327808	2018-03-18 19:56:15 +00:00
Nirav Dave	5f0ab71b62	Revert "[DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"" as it times out building test-suite on PPC. llvm-svn: 327778	2018-03-17 19:24:54 +00:00
Nirav Dave	982d3a56ea	[DAG, X86] Revert r327197 "Revert r327170, r327171, r327172" Reland ISel cycle checking improvements after simplifying and reducing node id invariant traversal. llvm-svn: 327777	2018-03-17 17:42:10 +00:00
Mikhail Maltsev	ed1c8bfec2	[ARM] Convert more invalid NEON immediate loads Summary: Currently the LLVM MC assembler is able to convert e.g. vmov.i32 d0, #0xabababab (which is technically invalid) into a valid instruction vmov.i8 d0, #0xab this patch adds support for vmov.i64 and for cases with the resulting load types other than i8, e.g.: vmov.i32 d0, #0xab00ab00 -> vmov.i16 d0, #0xab00 Reviewers: olista01, rengolin Reviewed By: rengolin Subscribers: rengolin, javed.absar, kristof.beyls, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D44467 llvm-svn: 327709	2018-03-16 14:10:56 +00:00
Mikhail Maltsev	8dcf6fa308	[ARM] Fix a check in vmov/vmvn immediate parsing Summary: Currently the check is incorrect and the following invalid instruction is accepted and incorrectly assembled: vmov.i32 d2, #0x00a500a6 This patch fixes the issue. Reviewers: olista01, rengolin Reviewed By: rengolin Subscribers: SjoerdMeijer, javed.absar, rogfer01, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44460 llvm-svn: 327704	2018-03-16 12:46:49 +00:00
Sjoerd Meijer	d391a1a985	[ARM] FP16 codegen support for VSEL This implements lowering of SELECT_CC for f16s, which enables codegen of VSEL with f16 types. Differential Revision: https://reviews.llvm.org/D44518 llvm-svn: 327695	2018-03-16 08:06:25 +00:00
Matt Arsenault	41e5ac4fa4	TargetMachine: Add address space to getPointerSize llvm-svn: 327467	2018-03-14 00:36:23 +00:00
Nirav Dave	042678bd55	Revert: r327172 "Correct load-op-store cycle detection analysis" r327171 "Improve Dependency analysis when doing multi-node Instruction Selection" r327170 "[DAG] Enforce stricter NodeId invariant during Instruction selection" Reverting patch as NodeId invariant change is causing pathological increases in compile time on PPC llvm-svn: 327197	2018-03-10 02:16:15 +00:00
Nirav Dave	071699bf82	[DAG] Enforce stricter NodeId invariant during Instruction selection Instruction Selection makes use of the topological ordering of nodes by node id (a node's operands have smaller node id than it) when doing cycle detection. During selection we may violate this property as a selection of multiple nodes may induce a use dependence (and thus a node id restriction) between two unrelated nodes. If a selected node has an unselected successor this may allow us to miss a cycle in detection an invalid selection. This patch fixes this by marking all unselected successors of a selected node have negated node id. We avoid pruning on such negative ids but still can reconstruct the original id for pruning. In-tree targets have been updated to replace DAG-level replacements with ISel-level ones which enforce this property. This preemptively fixes PR36312 before triggering commit r324359 relands Reviewers: craig.topper, bogner, jyknight Subscribers: arsenm, nhaehnle, javed.absar, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D43198 llvm-svn: 327170	2018-03-09 20:57:15 +00:00
Sjoerd Meijer	af30f06d5c	[ARM] Fix for PR36577 Don't PerformSHLSimplify if the given node is used by a node that also uses a constant because we may get stuck in an infinite combine loop. bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36577 Patch by Sam Parker. Differential Revision: https://reviews.llvm.org/D44097 llvm-svn: 326882	2018-03-07 09:10:44 +00:00
Simi Pallipurath	75c6bfeac9	[ARM]Decoding MSR with unpredictable destination register causes an assert This patch handling: Enable parsing of raw encodings of system registers . Allows UNPREDICTABLE sysregs to be decoded to a raw number in the same way that disasslib does, rather than llvm crashing. Disassemble msr/mrs with unpredictable sysregs as SoftFail. Fix regression due to SoftFailing some encodings. Patch by Chris Ryder Differential revision:https://reviews.llvm.org/D43374 llvm-svn: 326803	2018-03-06 15:21:19 +00:00
Oliver Stannard	f20222a83c	[ARM][Asm] VMOVSRR and VMOVRRS need sequential S registers These instructions require that the two S registers are adjacent (but not the R registers), because only the first register is included in the encoding, but we were not checking this in the assembler. Differential revision: https://reviews.llvm.org/D44084 llvm-svn: 326696	2018-03-05 13:27:26 +00:00
Thomas Preud'homme	c699eaa311	Fix location of comment in EmitPopInst Comment about folding return in LDM was not moved along with the corresponding code in r242714. This commit fixes that. llvm-svn: 326690	2018-03-05 11:49:00 +00:00
Benjamin Kramer	4925653555	[ARM] Fold variable into assert. Avoids unused variable warnings in Release mode. llvm-svn: 326592	2018-03-02 17:39:20 +00:00
Momchil Velikov	505614bb4f	[ARM] Fix access to stack arguments when re-aligning SP in Armv6m When an Armv6m function dynamically re-aligns the stack, access to incoming stack arguments (and to stack area, allocated for register varargs) is done via SP, which is incorrect, as the SP is offset by an unknown amount relative to the value of SP upon function entry. This patch fixes it, by making access to "fixed" frame objects be done via FP when the function needs stack re-alignment. It also changes the access to "fixed" frame objects be done via FP (instead of using R6/BP) also for the case when the stack frame contains variable sized objects. This should allow more objects to fit within the immediate offset of the load instruction. All of the above via a small refactoring to reuse the existing `ARMFrameLowering::ResolveFrameIndexReference.` Differential Revision: https://reviews.llvm.org/D43566 llvm-svn: 326584	2018-03-02 15:47:14 +00:00
Florian Hahn	9deef20b6c	[ARM] Fix codegen for VLD3/VLD4/VST3/VST4 with WB Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is broken due to 2 separate bugs: 1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing rules to expand them to non pseudo MIR. These are selected for ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD. 2) Selection of the right VLD/VST instruction is broken for load and store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called with MIR opcode for fixed writeback (ie increment is access size) and call getVLDSTRegisterUpdateOpcode() to select an opcode with register writeback if base register update is of a different size. Since getVLDSTRegisterUpdateOpcode() only knows about VLD1/VLD2/VST1/VST2 the call is currently conditional on the number of element in the vector. However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not updated which later lead to a fixed writeback instruction being constructed with an extra operand for the register writeback. This patch addresses the two issues as follows: - it adds the necessary mapping from VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register to VLD1d64Twb_register and VLD1d64Qwb_register respectively. Like for the existing _fixed variants, the cost of these is bumped for unaligned access. - it changes the logic in SelectVLD and SelectVST to call isVLDfixed and isVSTfixed respectively to decide whether the opcode should be updated. It also reworks the logic and comments for pushing the writeback offset operand and r0 operand to clarify the logic: writeback offset needs to be pushed if it's a register writeback, r0 needs to be pushed if not and the instruction is a VLD1/VLD2/VST1/VST2. Reviewers: rengolin, t.p.northover, samparker Reviewed By: samparker Patch by Thomas Preud'homme <thomas.preudhomme@arm.com> Differential Revision: https://reviews.llvm.org/D42970 llvm-svn: 326570	2018-03-02 13:02:55 +00:00
Chih-Hung Hsieh	9f9e4681ac	[TLS] use emulated TLS if the target supports only this mode Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341	2018-02-28 17:48:55 +00:00
Pablo Barrio	512f7ee315	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before. Patch by Martin Svanfeldt Reviewers: fhahn, pbarrio, rogfer01 Reviewed By: pbarrio, rogfer01 Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 326333	2018-02-28 17:13:07 +00:00
Andrew Zhogin	f8e88af11d	[ARM] Cortex-A57 scheduler fix for ARM backend (missed 16-bit, v8.1/v8.2/v8.3, thumb and pseudo instructions) Added missed scheduling info for ARM Cortex A57 (AArch32) to have CompleteModel with this checkCompleteness fix: https://reviews.llvm.org/D43235. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D43808 llvm-svn: 326304	2018-02-28 05:53:18 +00:00
Sjoerd Meijer	fc0d02cbbf	[ARM] Another f16 litpool fix We were always setting the block alignment to 2 bytes in Thumb mode and 4 bytes in ARM mode (r325754, and r325012), but this could reduce the block alignment when the block had already been aligned (e.g. in Thumb mode when the block is a CPE that was already 4-byte aligned). Patch by Momchil Velikov, I've only added a test. Differential Revision: https://reviews.llvm.org/D43777 llvm-svn: 326232	2018-02-27 19:26:02 +00:00
Peter Collingbourne	e8436e8631	ARM: Don't rewrite add reg, $sp, 0 -> mov reg, $sp if the add defines CPSR. Differential Revision: https://reviews.llvm.org/D43807 llvm-svn: 326226	2018-02-27 19:00:59 +00:00
Aditya Nandakumar	599990530e	[GISel]: Don't assert when constraining RegisterOperands which are uses. Currently we assert that only non target specific opcodes can have missing RegisterClass constraints in the MCDesc. The backend can have instructions with register operands that don't have RegisterClass constraints (say using unknown_class), in which case the instruction defining the register will constrain it. Change the assert to only fire if a def has no regclass. https://reviews.llvm.org/D43409 llvm-svn: 326142	2018-02-26 22:56:21 +00:00
Geoff Berry	f8bf2ec0a8	[MachineOperand][Target] MachineOperand::isRenamable semantics changes Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were affected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931	2018-02-23 18:25:08 +00:00
Hans Wennborg	89c35fc44d	Support for the mno-stack-arg-probe flag Adds support for this flag. There is also another piece for clang (separate review). More info: https://bugs.llvm.org/show_bug.cgi?id=36221 By Ruslan Nikolaev! Differential Revision: https://reviews.llvm.org/D43107 llvm-svn: 325900	2018-02-23 13:46:25 +00:00
Sjoerd Meijer	d31a8c0595	Recommit: [ARM] f16 constant pool fix This recommits r325754; the modified and failing test case actually didn't need any modifications. llvm-svn: 325765	2018-02-22 10:43:57 +00:00
David Green	01e0f25a9f	[ARM] Fix issue with large xor constants. Fixup to rL325573 for large xor constants. Thanks to Eli Friedman for the catch. Differential revision: https://reviews.llvm.org/D43549 llvm-svn: 325761	2018-02-22 09:38:57 +00:00
Sjoerd Meijer	9a25247f80	Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy. llvm-svn: 325756	2018-02-22 08:41:55 +00:00
Sjoerd Meijer	7d5909eb0f	[ARM] f16 constant pool fix This is a follow up of r325012, that allowed half types in constant pools. Proper alignment was enforced when a big basic block was split up, but not when a CPE was placed before/after a block; the successor block had the wrong alignment. Differential Revision: https://reviews.llvm.org/D43580 llvm-svn: 325754	2018-02-22 08:16:05 +00:00
Hiroshi Inoue	7f9f92f8b6	[NFC] fix trivial typos in comments "a a" -> "a" llvm-svn: 325752	2018-02-22 07:48:29 +00:00
Sjoerd Meijer	4d5c40492a	[ARM] Lower BR_CC for f16 This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616	2018-02-20 19:28:05 +00:00
David Green	056476497e	[ARM] Mark -1 as cheap in xor's for thumb1 We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1 would not otherwise be considered a cheap constant. This prevents the -1's from being pulled out into constants and potentially hoisted. Differential Revision: https://reviews.llvm.org/D43451 llvm-svn: 325573	2018-02-20 11:07:35 +00:00
Jonas Paulsson	995ba6e42c	[ARM] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Eli Friedman llvm-svn: 325327	2018-02-16 09:51:01 +00:00
Roger Ferrer Ibanez	d41059a9f6	[ARM] Materialise some boolean values to avoid a branch This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form a != b ? x : 0 a == b ? 0 : x and that currently (e.g. in Thumb1) are emitted as branches. Differential Revision: https://reviews.llvm.org/D34515 llvm-svn: 325323	2018-02-16 09:23:59 +00:00
Pablo Barrio	e28cb8399a	[ARM] Allow 64- and 128-bit types with 't' inline asm constraint Summary: In LLVM, 't' selects a floating-point/SIMD register and only supports 32-bit values. This is appropriately documented in the LLVM Language Reference Manual. However, this behaviour diverges from that of GCC, where 't' selects the s0-s31 registers and its qX and dX variants depending on additional operand modifiers (q/P). For example, the following C code: #include <arm_neon.h> float32x4_t a, b, x; asm("vadd.f32 %0, %1, %2" : "=t" (x) : "t" (a), "t" (b)) results in the following assembly if compiled with GCC: vadd.f32 s0, s0, s1 whereas LLVM will show "error: couldn't allocate output register for constraint 't'", since a, b, x are 128-bit variables, not 32-bit. This patch extends the use of 't' to mean that of GCC, thus allowing selection of the lower Q vector regs and their D/S variants. For example, the earlier code will now compile as: vadd.f32 q0, q0, q1 This behaviour still differs from that of GCC but I think it is actually more correct, since LLVM picks up the right register type based on the datatype of x, while GCC would need an extra operand modifier to achieve the same result, as follows: asm("vadd.f32 %q0, %q1, %q2" : "=t" (x) : "t" (a), "t" (b)) Since this is only an extension of functionality, existing code should not be affected by this change. Note that operand modifiers q/P are already supported by LLVM, so this patch should suffice to support inline assembly with constraint 't' originally built for GCC. Reviewers: grosbach, rengolin Reviewed By: rengolin Subscribers: rogfer01, efriedma, olista01, aemerson, javed.absar, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42962 llvm-svn: 325244	2018-02-15 14:44:22 +00:00
Sjoerd Meijer	9430c8cd1c	[ARM] f16 vcmp fixes This adds f16 VCMP match rules and fixes the test cases. Differential Revision: https://reviews.llvm.org/D43291 llvm-svn: 325228	2018-02-15 10:33:07 +00:00
Sjoerd Meijer	3b4294edd2	[ARM] f16 stack spill/reloads This adds support for handling f16 stack spills/reloads. Differential Revision: https://reviews.llvm.org/D43280 llvm-svn: 325130	2018-02-14 15:09:09 +00:00
Sjoerd Meijer	f4a7fa7bbe	[ARM] Allow half types in ConstantPool Change ARMConstantIslandPass to: - accept f16 literals as litpool entries, - if the litpool needs to be inserted in the middle of a big block, then we need to 4-byte align the next instruction in ARM mode. Differential Revision: https://reviews.llvm.org/D42784 llvm-svn: 325012	2018-02-13 15:34:09 +00:00
Andre Vieira	f00234c0bf	[ARM] Don't print "Requires NEON" error message for M-profile Differential Revision: https://reviews.llvm.org/D43125 llvm-svn: 325000	2018-02-13 11:46:38 +00:00
Sjoerd Meijer	101ee43072	[Thumb] Handle addressing mode AddrMode5FP16 This addressing mode wasn't checked, so we were running in an assert. Differential Revision: https://reviews.llvm.org/D43179 llvm-svn: 324996	2018-02-13 10:29:03 +00:00
Daniel Neilson	7512c3e15f	[ARMFastISel] Replace deprecated calls to MemoryIntrinsic::getAlignment() (NFCI) This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes ARMFastISel to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324781	2018-02-09 23:31:37 +00:00
Oliver Stannard	133b6085e8	[ARM] Re-commit r324600 with fixed LLVMBuild.txt ARMDisassembler now depends on the banked register tables in ARMUtils, so the LLVMBuild.txt needed updating to reflect this. Original commit message: [ARM] Fix disassembly of invalid banked register moves When disassembling banked register move instructions, we don't have an assembly syntax for the unallocated register numbers, so we have to return Fail rather than SoftFail. Previously we were returning SoftFail, then crashing in the InstPrinter as we have no way to represent these encodings in an assembly string. This also switches the decoder to use the table-generated list of banked registers, removing the duplicated list of encodings. Differential revision: https://reviews.llvm.org/D43066 llvm-svn: 324606	2018-02-08 14:31:22 +00:00
Oliver Stannard	3c11ecbbab	Revert r324600 as it breaks a buildbot The broken bot (clang-ppc64le-linux-multistage) is doing a shared-object build, so I guess using lookupBankedRegByEncoding in the disassembler is a layering violation? llvm-svn: 324604	2018-02-08 14:21:28 +00:00
Oliver Stannard	db982b25ff	[ARM] Fix disassembly of invalid banked register moves When disassembling banked register move instructions, we don't have an assembly syntax for the unallocated register numbers, so we have to return Fail rather than SoftFail. Previously we were returning SoftFail, then crashing in the InstPrinter as we have no way to represent these encodings in an assembly string. This also switches the decoder to use the table-generated list of banked registers, removing the duplicated list of encodings. Differential revision: https://reviews.llvm.org/D43066 llvm-svn: 324600	2018-02-08 13:06:08 +00:00
Peter Collingbourne	559ff1fe03	ARM: Remove dead code. NFCI. llvm-svn: 324565	2018-02-08 05:28:39 +00:00
Sjoerd Meijer	8c0739347c	[ARM] FP16 mov imm pattern This is a follow up of r324321, adding a match pattern for mov with a FP16 immediate (also fixing operand vfp_f16imm that wasn't even compiling). Differential Revision: https://reviews.llvm.org/D42973 llvm-svn: 324456	2018-02-07 08:37:17 +00:00
Sjoerd Meijer	d2718ba95e	[ARM] f16 conversions This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion match patterns. Differential Revision: https://reviews.llvm.org/D42954 llvm-svn: 324360	2018-02-06 16:28:43 +00:00
Oliver Stannard	ee0ac39305	[ARM][AArch64] Add CSDB speculation barrier instruction This adds the CSDB instruction, which is a new barrier instruction described by the whitepaper at [1]. This is in encoding space which was previously executed as a NOP, so it is available for all targets that have the relevant NOP encoding space. This matches the binutils behaviour for these instructions [2][3]. [1] https://developer.arm.com/support/security-update [2] https://sourceware.org/ml/binutils/2018-01/msg00116.html [3] https://sourceware.org/ml/binutils/2018-01/msg00120.html llvm-svn: 324324	2018-02-06 09:24:47 +00:00
Sjoerd Meijer	89ea2648bb	[ARM] Armv8.2-A FP16 code generation (part 3/3) This adds most of the FP16 codegen support, but these areas need further work: - FP16 literals and immediates are not properly supported yet (e.g. literal pool needs work), - Instructions that are generated from intrinsics (e.g. vabs) haven't been added. This will be addressed in follow-up patches. Differential Revision: https://reviews.llvm.org/D42849 llvm-svn: 324321	2018-02-06 08:43:56 +00:00
Sjoerd Meijer	9d9a86535e	[ARM] FullFP16 LowerReturn Fix Commit r323512 introduced an optimisation in LowerReturn for half-precision return values. A missing check caused a crash when the return value is "undef" (i.e. a node that has no operands). Differential Revision: https://reviews.llvm.org/D42743 llvm-svn: 323968	2018-02-01 13:48:40 +00:00
Yvan Roux	490e9e6761	[ARM] Add support for unpredictable MVN instructions. This fixes bugzilla 33011 https://bugs.llvm.org/show_bug.cgi?id=33011 Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in sections A8.8.116 and A8.8.117. It fixes also the usage of PC register as destination register for MVN register-shifted register version as specified in A8.8.117. Differential Revision: https://reviews.llvm.org/D41905 llvm-svn: 323954	2018-02-01 12:06:57 +00:00
Yvan Roux	705e26a243	Test commit: Fix a comment. llvm-svn: 323947	2018-02-01 08:39:58 +00:00
Evgeniy Stepanov	7746899f48	Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations" Miscompiles code. Testcase pending. This reverts commit r323869. llvm-svn: 323929	2018-01-31 22:55:19 +00:00
Diana Picus	12ed95e3e7	Fix formatting for r323876. NFC llvm-svn: 323878	2018-01-31 15:16:17 +00:00
Diana Picus	1d4421f6a6	[ARM GlobalISel] Modernize LegalizerInfo. NFCI Start using the new LegalizerInfo API introduced in r323681. Keep the old API for opcodes that need Lowering in some circumstances (G_FNEG and G_UREM/G_SREM). llvm-svn: 323876	2018-01-31 14:55:07 +00:00
Pablo Barrio	2e442a7831	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves. In Thumb mode this results in a two-instruction sequence, a shift followed by a bic or orr, while in ARM/Thumb2 mode, which has a flexible second operand, the shift can be folded into a single bic/orr instruction. In most cases this results in smaller code and possibly fewer branches, and in no case larger code than before. Patch by Marten Svanfeldt. Reviewers: fhahn, pbarrio Reviewed By: pbarrio Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 323869	2018-01-31 13:20:10 +00:00
Sjoerd Meijer	98d5359ea2	[ARM] Armv8.2-A FP16 code generation (part 2/3) Half-precision arguments and return values are passed as if it were an int or float for ARM. This results in truncates and bitcasts to/from i16 and f16 values, which are legalized very early to stack stores/loads. When FullFP16 is enabled, we want to avoid codegen for these bitcasts as it is unnecessary and inefficient. Differential Revision: https://reviews.llvm.org/D42580 llvm-svn: 323861	2018-01-31 10:18:29 +00:00
Roger Ferrer Ibanez	aea4208720	[ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR. In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do copies CPSR ↔ GPR but not all Thumb1 targets implement them. The schedule can attempt, before attempting a copy, to clone the instructions but it does not currently do that for nodes with input glue. In this patch we introduce a target-hook to let the hook decide if a glued machinenode is still eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS . As a follow-up of this change we should actually implement the copies for the Thumb1 targets that do implement them and restrict the hook to the targets that can't really do such copy as these clones are not ideal. This change fixes PR35836. Differential Revision: https://reviews.llvm.org/D42051 llvm-svn: 323857	2018-01-31 09:23:43 +00:00
Diana Picus	2a5b962030	[ARM GlobalISel] Map G_SITOFP and G_UITOFP Straightforward mapping (integer operand to GPR, floating point operand to FPR). llvm-svn: 323731	2018-01-30 09:15:23 +00:00
Diana Picus	517531e5a5	[ARM GlobalISel] Legalize G_SITOFP and G_UITOFP Legal if we have hardware support, libcall otherwise. Also add supporting code to the legalizer helper for libcalls. llvm-svn: 323730	2018-01-30 09:15:17 +00:00
Diana Picus	a2da03022c	[ARM GlobalISel] Map G_FPTOSI and G_FPTOUI Straightforward mapping (integer operand goes to GPR, floating point operand goes to FPR). llvm-svn: 323727	2018-01-30 07:54:58 +00:00
Diana Picus	4ed0ee7b5f	[ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI Legal if we have hardware support for floating point, libcalls otherwise. Also add the necessary support for libcalls in the legalizer helper. llvm-svn: 323726	2018-01-30 07:54:52 +00:00
Daniel Sanders	08464524c3	[ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern Summary: Apparently, we missed on constraining register classes of VReg-operands of all the instructions built from a destination pattern but the root (top-level) one. The issue exposed itself while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained, while nested VTOSIZS (or rather its destination virtual register to be exact) does not. Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI. https://bugs.llvm.org/show_bug.cgi?id=35965 rdar://problem/36886530 Patch by Roman Tereshin Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan Reviewed By: dsanders, qcolombet, rovka Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42565 llvm-svn: 323692	2018-01-29 21:09:12 +00:00
Daniel Sanders	9ade5592d9	[globalisel] Make LegalizerInfo::LegalizeAction available outside of LegalizerInfo. NFC Summary: The improvements to the LegalizerInfo discussed in D42244 require that LegalizerInfo::LegalizeAction be available for use in other classes. As such, it needs to be moved out of LegalizerInfo. This has been done separately to the next patch to minimize the noise in that patch. llvm-svn: 323669	2018-01-29 17:37:29 +00:00
Sjoerd Meijer	3ddb7fb663	[ARM] FP16Pat and FullFP16Pat patterns. NFC. Create and use FP16Pat FullFP16Pat helper patterns to make the difference explicit. Differential Revision: https://reviews.llvm.org/D42634 llvm-svn: 323640	2018-01-29 11:28:06 +00:00
Momchil Velikov	d2cc6fd90b	[ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative load instruction The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR` register class. The function serves to emit a `tLDRspi` instruction and certainly any subset of the `tGPR` register class is a valid destination of the load. Differential revision: https://reviews.llvm.org/D42535 llvm-svn: 323514	2018-01-26 10:20:58 +00:00
Sjoerd Meijer	011de9c0ca	[ARM] Armv8.2-A FP16 code generation (part 1/3) This is the groundwork for Armv8.2-A FP16 code generation. Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this: _Float16 sub(_Float16 a, _Float16 b) { return a + b; } gets lowered to this: define float @sub(float %a.coerce, float %b.coerce) { entry: %0 = bitcast float %a.coerce to i32 %tmp.0.extract.trunc = trunc i32 %0 to i16 %1 = bitcast i16 %tmp.0.extract.trunc to half <SNIP> %add = fadd half %1, %3 <SNIP> } When FullFP16 is not supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled. When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts. As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issues before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH descriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not to be implemented at all, so that has also been added. This is the minimal patch that shows all the different moving parts. 
In patch 2/3 I will add some efficient lowering of bitcasts, and in 3/3 I will add the remaining Armv8.2-A FP16 instruction descriptions. Thanks to Sam Parker and Oliver Stannard for their help and reviews! Differential Revision: https://reviews.llvm.org/D38315 llvm-svn: 323512	2018-01-26 09:26:40 +00:00
Weiming Zhao	665784f170	[ARM] Expand long shifts for Thumb1 to __aeabi_ calls Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354	2018-01-24 18:00:57 +00:00
Martin Storsjo	4ed94a06ac	[ARM] Call __chkstk for dynamic stack allocation in all windows environments This matches what MSVC does for alloca() function calls on ARM. Even if MSVC doesn't support VLAs at the language level, it does support the alloca function. On the clang level, both the _alloca() (when emulating MSVC, which is what the alloca() function expands to) and __builtin_alloca() builtin functions, and VLAs, map to the same LLVM IR "alloca" function - so within LLVM they're not distinguishable from each other. Differential Revision: https://reviews.llvm.org/D42292 llvm-svn: 323308	2018-01-24 06:40:11 +00:00
Joel Galenson	1d89cd2bb4	[ARM] Cleanup part of ARMBaseInstrInfo::optimizeCompareInstr (NFCI). As noted in another review, this loop is confusing. This commit cleans it up somewhat. Differential Revision: https://reviews.llvm.org/D42312 llvm-svn: 323136	2018-01-22 17:53:47 +00:00
Marina Yatsina	0bf841ac2a	Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files. This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are aimed at refactoring the existing code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40333 Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56 llvm-svn: 323095	2018-01-22 10:06:50 +00:00
Marina Yatsina	3d8efa4f0c	Rename ExecutionDepsFix files to ExecutionDomainFix This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are aimed at refactoring the existing code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40332 Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8 llvm-svn: 323094	2018-01-22 10:06:33 +00:00
Marina Yatsina	6fc2aaae8d	Separate ExecutionDepsFix into 4 parts: 1. ReachingDefsAnalysis - Identifies, for each instruction, the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains). 2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings. 3. BreakFalseDeps - Breaks false dependencies. 4. LoopTraversal - Creates a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks. This also included the following changes to the original ExecutionDepsFix logic: 1. BreakFalseDeps and ReachingDefsAnalysis logic is no longer restricted by a register class. 2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class. Additional changes in affected files: 1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps was also added to the passes they activate. 2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate. Additional refactoring changes will follow. This commit is (almost) NFC. The only functional change is that now BreakFalseDeps will break dependency for all register classes. Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet. In a future commit several instructions (and tests) will be added. This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are aimed at refactoring the existing code. 
Additional relevant reviews: https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40330 Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275 llvm-svn: 323087	2018-01-22 10:05:23 +00:00
Joel Galenson	dbc724f764	[ARM] Fix perf regression in compare optimization. Fix a performance regression caused by r322737. While trying to make it easier to replace compares with existing adds and subtracts, I accidentally stopped it from doing so in some cases. This should fix that. I'm also fixing another potential bug in that commit. Differential Revision: https://reviews.llvm.org/D42263 llvm-svn: 322972	2018-01-19 17:46:27 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. 
s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g
The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Reid Kleckner	1aa9061c5f	[CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64 Every known PE COFF target emits /EXPORT: linker flags into a .drectve section. The AsmPrinter should handle this. While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable. llvm-svn: 322788	2018-01-17 23:55:23 +00:00
Joel Galenson	bbcaf4ac5c	[ARM] Optimize {s,u}mul.with.overflow. This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG. Differential revision: https://reviews.llvm.org/D40922 llvm-svn: 322738	2018-01-17 19:19:05 +00:00
Joel Galenson	fe7fa40869	[ARM] Optimize {s,u}{add,sub}.with.overflow. The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with an "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other). Differential revision: https://reviews.llvm.org/D38378 llvm-svn: 322737	2018-01-17 19:19:05 +00:00
Diana Picus	01bcfd2112	[ARM GlobalISel] Rename local variable. NFC llvm-svn: 322667	2018-01-17 15:25:37 +00:00
Diana Picus	c62a16234b	[ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR llvm-svn: 322657	2018-01-17 14:14:14 +00:00
Diana Picus	65ed364fac	[ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double. Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises. llvm-svn: 322651	2018-01-17 13:34:10 +00:00
Diana Picus	2dc5405693	[ARM GlobalISel] Map G_FMA to FPR llvm-svn: 322367	2018-01-12 12:06:01 +00:00
Diana Picus	e74243d473	[ARM GlobalISel] Legalize G_FMA For hard float with VFP4, it is legal. Otherwise, we use libcalls. This needs a bit of support in the LegalizerHelper for soft float because we didn't handle G_FMA libcalls yet. The support is trivial, as the only difference between G_FMA and other libcalls that we already handle is that it has 3 input operands rather than just 2. llvm-svn: 322366	2018-01-12 11:30:45 +00:00
Andre Vieira	5627c218e1	[ARM] Add codegen for SMMULR, SMMLAR and SMMLSR This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR instructions from equivalent IR patterns. Differential Revision: https://reviews.llvm.org/D41775 llvm-svn: 322361	2018-01-12 09:24:41 +00:00
Andre Vieira	26b9de9ebb	[ARM] Fix erroneous availability of SMMLS for Armv7-M Differential Revision: https://reviews.llvm.org/D41855 llvm-svn: 322360	2018-01-12 09:21:09 +00:00
Matthias Braun	ea4359e922	PeepholeOptimizer: Fix for vregs without defs The PeepholeOptimizer would fail for vregs without a definition. If this was caused by an undef operand, abort to keep the code simple (so we don't need to add logic everywhere to replicate the undef flag). Differential Revision: https://reviews.llvm.org/D40763 llvm-svn: 322319	2018-01-11 22:30:43 +00:00
Evgeniy Stepanov	5223b5d9d6	[arm] Implement Target Operand Flag MIR serialization. Reviewers: efriedma, pcc Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D39975 llvm-svn: 322312	2018-01-11 21:37:58 +00:00
Diana Picus	0ed7513c83	[ARM GlobalISel] Map G_FNEG to the FPR bank llvm-svn: 322169	2018-01-10 11:13:31 +00:00
Diana Picus	f949a0abac	[ARM GlobalISel] Legalize G_FNEG for s32 and s64 For hard float, it is legal. For soft float, we need to lower to 0 - x first, and then we can use the libcall for G_FSUB. This is undoing some of the canonicalization performed by the IRTranslator (which introduces G_FNEG when it sees a 0 - x). Ideally, that canonicalization would be performed by a pre-legalizer pass that would allow targets to opt out of this behaviour rather than dance around it in the legalizer. llvm-svn: 322168	2018-01-10 10:45:34 +00:00
Diana Picus	8f14886630	[ARM GlobalISel] Legalize s32/s64 G_FCONSTANT Legal for hard float. Change to G_CONSTANT for soft float (but preserve the binary representation). llvm-svn: 322164	2018-01-10 10:01:49 +00:00
Diana Picus	734a5e8912	[ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits Make G_CONSTANT narrow for any scalars larger than 32 bits. llvm-svn: 322162	2018-01-10 09:32:01 +00:00
Francis Visoiu Mistrih	7d9bef8f5c	[CodeGen] Don't print "pred:" and "opt:" in -debug output In -debug output we print "pred:" whenever a MachineOperand is a predicate operand in the instruction descriptor, and "opt:" whenever a MachineOperand is an optional def in the instruction descriptor. Differential Revision: https://reviews.llvm.org/D41870 llvm-svn: 322096	2018-01-09 17:31:07 +00:00
Momchil Velikov	ac7c5c1d92	[ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz The patch makes the unwind information not mention registers that were pushed solely for the purpose of saving stack adjustment instructions. Differential revision: https://reviews.llvm.org/D41300 Fixes https://bugs.llvm.org/show_bug.cgi?id=35379 llvm-svn: 321996	2018-01-08 14:47:19 +00:00


9736 Commits