llvm-project

Commit Graph

Author	SHA1	Message	Date
Mikhail Maltsev	4a9e3f15bb	[ARM] MVE: support QQPRRegClass and QQQQPRRegClass Summary: QQPRRegClass and QQQQPRRegClass are used by the interleaving/deinterleaving loads/stores to represent sequences of consecutive SIMD registers. Reviewers: ostannard, simon_tatham, dmgreen Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64009 llvm-svn: 364794	2019-07-01 16:05:23 +00:00
Sam Parker	98722691b0	[ARM] WLS/LE Code Generation Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733	2019-07-01 08:21:28 +00:00
Sam Tebbs	e39e958da3	[ARM] Add support for the MVE long shift instructions MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers. The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl. test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions. Differential Revision: https://reviews.llvm.org/D63430 llvm-svn: 364654	2019-06-28 15:43:31 +00:00
David Green	2883944035	[ARM] Mark math routines as non-legal for MVE This adds handling and tests for a number of floating point math routines, which have no MVE instructions. Differential Revision: https://reviews.llvm.org/D63725 llvm-svn: 364641	2019-06-28 11:17:38 +00:00
David Green	eb7080ac6e	[ARM] Widening loads and narrowing stores MVE has instructions to widen as it loads, and narrow as it stores. This adds the required patterns and legalisation to make them work including specifying that they are legal, patterns to select them and test changes. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63839 llvm-svn: 364636	2019-06-28 09:47:55 +00:00
David Green	07e53fee14	[ARM] MVE loads and stores This fills in the gaps for basic MVE loads and stores, allowing unaligned access and adding far too many tests. These will become important as narrowing/expanding and pre/post inc are added. Big endian might still not be handled very well, because we have not yet added bitcasts (and I'm not sure how we want it to work yet). I've included the alignment code anyway which maps with our current patterns. We plan to return to that later. Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev. Differential Revision: https://reviews.llvm.org/D63838 llvm-svn: 364633	2019-06-28 08:41:40 +00:00
David Green	fc4102417b	[ARM] Mark div and rem as expand for MVE We don't have vector operations for these, so they need to be expanded for both integer and float. Differential Revision: https://reviews.llvm.org/D63595 llvm-svn: 364631	2019-06-28 08:18:55 +00:00
David Green	8be372b190	[ARM] MVE vector shuffles This patch adds necessary shuffle vector and buildvector support for ARM MVE. It essentially adds support for VDUP, VREVs and some VMOVs, which are often required by other code (like upcoming patches). This mostly uses the same code from Neon that already generated NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and moved to ARMInstrInfo as they are common to both architectures. Most of the selection code seems to be applicable to both, but NEON does have some more instructions making some parts specific. Most code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63567 llvm-svn: 364626	2019-06-28 07:08:42 +00:00
Sam Tebbs	8747c5f482	[ARM] Fix formatting issue in ARMISelLowering.cpp Fix a formatting error in ARMISelLowering.cpp::Expand64BitShift. My test commit after receiving write access. llvm-svn: 364560	2019-06-27 16:28:28 +00:00
Eli Friedman	ab1d73ee32	[ARM] Don't reserve R12 on Thumb1 as an emergency spill slot. The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1. (Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.) A rough outline of the changes in the patch: 1. Gets rid of ThumbRegisterInfo::saveScavengerRegister. 2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1. 3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1. 4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot. 5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible. 6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer. 7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward. 8. Changes Thumb1 prologue and epilogue emission so it never uses register scavenging. Some of the changes to the emergency spill slot heuristics in determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1. Differential Revision: https://reviews.llvm.org/D63677 llvm-svn: 364490	2019-06-26 23:46:51 +00:00
Fangrui Song	6a4c68e187	[ARM] Fix -Wimplicit-fallthrough after D60709/r364331 llvm-svn: 364376	2019-06-26 02:34:10 +00:00
Simon Tatham	e8de8ba6a6	[ARM] Support inline assembler constraints for MVE. "To" selects an odd-numbered GPR, and "Te" an even one. There are some 8.1-M instructions that have one too few bits in their register fields and require registers of particular parity, without necessarily using a consecutive even/odd pair. Also, the constraint letter "t" should select an MVE q-register, when MVE is present. This didn't need any source changes, but some extra tests have been added. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60709 llvm-svn: 364331	2019-06-25 16:49:32 +00:00
Simon Tatham	a4b415a683	[ARM] Code-generation infrastructure for MVE. This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and if you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 llvm-svn: 364329	2019-06-25 16:48:46 +00:00
Fangrui Song	807d2f442a	[ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692 llvm-svn: 364312	2019-06-25 13:28:44 +00:00
Simon Tatham	4cf18c2849	[ARM] Explicit lowering of half <-> double conversions. If an FP_EXTEND or FP_ROUND isel dag node converts directly between f16 and f32 when the target CPU has no instruction to do it in one go, it has to be done in two steps instead, going via f32. Previously, this was done implicitly, because all such CPUs had the storage-only implementation of f16 (i.e. the only thing you can do with one at all is to convert it to/from f32). So isel would legalize the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp node (or vice versa), and then the fp_extend would already be f32->f64 rather than f16->f64. But that technique can't support a target CPU which has full f16 support but _not_ f64, such as some variants of Arm v8.1-M. So now we provide custom lowering for FP_EXTEND and FP_ROUND, which checks support for f16 and f64 and decides on the best thing to do given the combination of flags it gets back. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60692 llvm-svn: 364294	2019-06-25 11:24:50 +00:00
Simon Pilgrim	4e0648a541	[TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 llvm-svn: 363179	2019-06-12 17:14:03 +00:00
David Green	c5471c2a57	[ARM] Adjust isLegalT1AddressImmediate for non-legal types Types such as float and i64's do not have legal loads in Thumb1, but will still be loaded with a LDR (or potentially multiple LDR's). As such we can treat the cost of addressing mode calculations the same as an i32 and get some optimisation benefits. Differential Revision: https://reviews.llvm.org/D62968 llvm-svn: 362874	2019-06-08 10:32:53 +00:00
David Green	342d1b81a3	[ARM] Add MVE addressing to isLegalT2AddressImmediate Now with MVE being added, we can add the vector addressing mode costs for it. These are generally imm7 multiplied by the size of the type being loaded / stored. Differential Revision: https://reviews.llvm.org/D62967 llvm-svn: 362873	2019-06-08 10:18:23 +00:00
David Green	4ecce205d5	[ARM] Add fp16 addressing to isLegalT2AddressImmediate The fp16 version of VLDR takes a imm8 multiplied by 2. This updates the costs to account for those, and adds extra testing. It is dependant upon hasFPRegs16 as this is what the load/store instructions require. Differential Revision: https://reviews.llvm.org/D62966 llvm-svn: 362872	2019-06-08 10:09:02 +00:00
Diogo N. Sampaio	df92f84110	[ARM][FIX] Ran out of registers due tail recursion Summary: - pr42062 When compiling for MinSize, ARMTargetLowering::LowerCall decides to indirect multiple calls to a same function. However, it disconsiders the limitation that thumb1 indirect calls require the callee to be in a register from r0 to r3 (llvm limiation). If all those registers are used by arguments, the compiler dies with "error: run out of registers during register allocation". This patch tells the function IsEligibleForTailCallOptimization if we intend to perform indirect calls, as to avoid tail call optimization. Reviewers: dmgreen, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62683 llvm-svn: 362366	2019-06-03 08:58:05 +00:00
Simon Pilgrim	5359bb4d31	[ARM] LowerVECTOR_SHUFFLE - fix uninitialized variable warnings. NFCI. llvm-svn: 362094	2019-05-30 14:01:24 +00:00
Simon Tatham	760df47b77	[ARM] Replace fp-only-sp and d16 with fp64 and d32. Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845	2019-05-28 16:13:20 +00:00
Alexander Timofeev	ba447bae74	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of the build failure. The reason was mlformed patch. Build failure fixed. llvm-svn: 361741	2019-05-26 20:33:26 +00:00
David Green	21542cd6f4	[ARM] Select a number of fp16 rounding functions This add patterns for fp16 round and ceil etc. Same as the float and double patterns. Differential Revision: https://reviews.llvm.org/D62326 llvm-svn: 361718	2019-05-26 11:13:00 +00:00
David Green	c9f4b7d201	[ARM] Promote various fp16 math intrinsics Promote a number of fp16 math intrinsics to float, so that the relevant float math routines can be used. Copysign is expanded so as to be handled in-place. Differential Revision: https://reviews.llvm.org/D62325 llvm-svn: 361717	2019-05-26 10:59:21 +00:00
David Green	caf8a11b65	[ARM] Promote fp16 frem Promote fp16 frem operations on ARM to floats so they call fmodf. Differential Revision: https://reviews.llvm.org/D62321 llvm-svn: 361713	2019-05-26 10:30:22 +00:00
Peter Collingbourne	3b93737446	Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio llvm-svn: 361688	2019-05-25 01:52:38 +00:00
Alexander Timofeev	dffedea014	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 llvm-svn: 361644	2019-05-24 15:32:18 +00:00
David Green	0582b22f10	[ARM] Don't use the Machine Scheduler for cortex-m at minsize The new cortex-m schedule in rL360768 helps performance, but can increase the amount of high-registers used. This, on average, ends up increasing the codesize by a fair amount (because less instructions are converted from T2 to T1). On cortex-m at -Oz, where we are quite size-paranoid, it is better to use the existing DAG scheduler with the RegPressure scheduling preference (at least until the issues around T2 vs T1 instructions can be improved). I have also made sure that the Sched::RegPressure dag scheduler is always chosen for MinSize. The test shows one case where we increase the number of registers used. Differential Revision: https://reviews.llvm.org/D61882 llvm-svn: 360769	2019-05-15 12:58:02 +00:00
Eli Friedman	2ea088173d	[ARM] Glue register copies to tail calls. This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 llvm-svn: 360099	2019-05-06 23:21:59 +00:00
Sjoerd Meijer	180f1ae57c	[TargetLowering] Change getOptimalMemOpType to take a function attribute list The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537	2019-04-30 08:38:12 +00:00
David Green	0d741507f7	[ARM] Rewrite isLegalT2AddressImmediate This does two main things, firstly adding some at least basic addressing modes for i64 types, and secondly treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seems to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 llvm-svn: 358845	2019-04-21 09:54:29 +00:00
Simon Pilgrim	e5573f4f4e	[TargetLowering] Rename preferShiftsToClearExtremeBits and shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair llvm-svn: 358526	2019-04-16 20:57:28 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Eli Friedman	3813fe0bda	[ARM] Optimize expressions like "return x != 0;" for Thumb1. There's an existing optimization for x != C, but somehow it was missing a special case for 0. While I'm here, also cleaned up the code/comments a bit: the second value produced by the MERGE_VALUES was actually dead, since a CMOV only produces one result. Differential Revision: https://reviews.llvm.org/D59616 llvm-svn: 357437	2019-04-02 00:01:23 +00:00
Eli Friedman	1e5d569c8c	[ARM] Add missing memory operands to a bunch of instructions. This should hopefully lead to minor improvements in code generation, and more accurate spill/reload comments in assembly. Also fix isLoadFromStackSlotPostFE/isStoreToStackSlotPostFE so they don't lead to misleading assembly comments for merged memory operands; this is technically orthogonal, but in practice the relevant memory operand lists don't show up without this change. Differential Revision: https://reviews.llvm.org/D59713 llvm-svn: 356963	2019-03-25 22:42:30 +00:00
Simon Pilgrim	87d261bfd3	[Thumb] Fix infinite loop in ABS expansion (PR41160) Don't expand ISD::ABS node if its legal. llvm-svn: 356661	2019-03-21 12:41:18 +00:00
Simon Pilgrim	7a8e5051f4	Fix unused variable warning. NFCI. llvm-svn: 356474	2019-03-19 16:49:59 +00:00
Simon Pilgrim	a56f2822d0	[SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468	2019-03-19 16:24:55 +00:00
Adhemerval Zanella	664c1ef528	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389	2019-03-18 18:40:07 +00:00
Tim Renouf	d1477e989c	[ARM] Fixed an assumption of power-of-2 vector MVT I am about to introduce some non-power-of-2 width vector MVTs. This commit fixes a power-of-2 assumption that my forthcoming change would otherwise break, as shown by test/CodeGen/ARM/vcvt_combine.ll and vdiv_combine.ll. Differential Revision: https://reviews.llvm.org/D58927 Change-Id: I56a282e365d3874ab0621e5bdef98a612f702317 llvm-svn: 356341	2019-03-17 20:48:54 +00:00
Florian Hahn	13bbcb3264	[ARM] Sink zext/sext operands for add and sub to enable vsubl generation. This uses the infrastructure added in rL353152 to sink zext and sexts to sub/add users, to enable vsubl/vaddl generation when NEON is available. See https://bugs.llvm.org/show_bug.cgi?id=40025. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D58063 llvm-svn: 355460	2019-03-06 00:10:03 +00:00
Oliver Stannard	4a9086b537	[ARM] Fix select_cc lowering for fp16 When lowering a select_cc node where the true and false values are of type f16, we can't use a general conditional move because the FP16 instructions do not support conditional execution. Instead, we must ensure that the condition code is one of the four supported by the VSEL instruction. Differential revision: https://reviews.llvm.org/D58813 llvm-svn: 355385	2019-03-05 10:42:34 +00:00
Oliver Stannard	e019e6223b	[ARM] Consider undefined-on-NaN conditions in checkVSELConstraints This function was not checking for the condition code variants which are undefined if either input is NaN, so we were missing selection of the VSEL instruction in some cases when using -fno-honor-nans or -ffast-math. Differential revision: https://reviews.llvm.org/D58812 llvm-svn: 355199	2019-03-01 13:58:25 +00:00
Kadir Cetinkaya	1b1b1a6135	[Target][ARM] Add a usage for SrcSz to unbreak build-bots without assertions llvm-svn: 355101	2019-02-28 15:55:11 +00:00
Bjorn Pettersson	d30f308a9f	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Sam Parker	5b09834bc3	[ARM] Add OptMinSize to ARMSubtarget In many places in the backend, we like to know whether we're optimising for code size and this is performed by checking the current machine function attributes. A subtarget is created on a per-function basis, so it's possible to know when we're compiling for code size on construction so record this in the new object. Differential Revision: https://reviews.llvm.org/D57812 llvm-svn: 353501	2019-02-08 07:57:42 +00:00
James Y Knight	7716075a17	[opaque pointer types] Pass value type to GetElementPtr creation. This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913	2019-02-01 20:44:47 +00:00
Reid Kleckner	27fd307b83	[ARM] Deduplicate table generated CC analysis code Create ARMCallingConv.cpp and emit code for calling convention analysis from there. llvm-svn: 352431	2019-01-28 21:28:43 +00:00
Matt Arsenault	39508331ef	Reapply "IR: Add fp operations to atomicrmw" This reapplies commits r351778 and r351782 with RISCV test fixes. llvm-svn: 351850	2019-01-22 18:18:02 +00:00

1 2 3 4 5 ...

1674 Commits