llvm-project

Commit Graph

Author	SHA1	Message	Date
Serguei Katkov	df25787797	[GreedyRA ORE] Extract stats in RAGreedyStats struct. NFC. Combine all collected stats into separate struct RAGreedyStats with add and report methods. The motivation is to extend the number of statistics to capture and instead of adding new parameters, just combine all of them into one structure. Additionally I plan to use report from different places in future to report data for function as well. Reviewers: reames, MatzeB, anemet, thegameg Reviewed By: thegameg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D100012	2021-04-08 14:27:37 +07:00
Serguei Katkov	0a1c6637a1	[GreedyRA ORE] Compute ORE stats if extra analysis is enabled To save compile time, avoid computing stats if ORE will not emit them. The motivation is to add more stats and compute them only if they will be dumped. Reviewers: reames, MatzeB, anemet, thegameg Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D100010	2021-04-08 14:24:18 +07:00
Esme-Yi	0c36da722a	[Debug-Info] Use inlined strings in .dwinfo section by default for DBX. Summary: Set the default DwarfInlinedStrings to inlined strings for DBX, because DBX does not support the .dwstr section for now. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99933	2021-04-08 07:20:22 +00:00
Hongtao Yu	2a2720a2de	[CSSPGO] Move pseudo probes to the beginning of a block to unblock SelectionDAG combine. Pseudo probes, when scattered in a block, can be chained dependencies of other regular DAG nodes and block DAG combine optimizations. To fix this, scattered probes in a block are grouped and placed at the beginning of the block. This shouldn't affect the profile quality. Test Plan: Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D100002	2021-04-07 22:45:35 -07:00
Arthur Eubanks	90af134473	Revert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()" This reverts commit `9583a3f262`. This wasn't NFC as initially thought. Needed for D99707.	2021-04-07 11:40:44 -07:00
Craig Topper	67953311e2	[SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR This allows FoldConstantArithmetic to handle SPLAT_VECTOR in addition to BUILD_VECTOR. This allows it to support scalable vectors. I'm also allowing fixed length SPLAT_VECTOR which is used by some targets, but I'm not familiar enough to write tests for those targets. I had to block this function from running on CONCAT_VECTORS to avoid calling getNode for a CONCAT_VECTORS of 2 scalars. This can happen because the 2 operand getNode calls this function for any opcode. Previously we were protected because CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR before that call. But it's not always possible to fold a CONCAT_VECTORS of SPLAT_VECTORs, and we don't even try. This fixes PR49781 where DAG combine thought constant folding should be possible, but FoldConstantArithmetic couldn't do it. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D99682	2021-04-07 10:03:33 -07:00
Yevgeny Rouban	3e738afae4	[Statepoint Lowering] Allow other than N byte sized types in deopt bundle I do not see any bit-width restriction from the point of the LLVM Lang Ref - Operand Bundles on the types of the deopt bundle operands. Statepoint Lowering seems to be able to work with any types. This patch relaxes the two related assertions and adds a new test for this change. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D100006	2021-04-07 17:48:31 +07:00
Nicolás Alvarez	a1aada75f5	[docs] Fix doxygen comments wrongly attached to the llvm namespace Looking at the Doxygen-generated documentation for the llvm namespace currently shows all sorts of random comments from different parts of the codebase. These are mostly caused by: - File doc comments that aren't marked with \file, so they're attached to the next declaration, which is usually "namespace llvm {". - Class doc comments placed before the namespace rather than before the class. - Code comments before the namespace that (in my opinion) shouldn't be extracted by doxygen at all. This commit fixes these comments. The generated doxygen documentation now has proper docs for several classes and files, and the docs for the llvm and llvm::detail namespaces are now empty. Reviewed By: thakis, mizvekov Differential Revision: https://reviews.llvm.org/D96736	2021-04-07 01:20:18 +02:00
Philip Reames	908215b346	Use AssumeInst in a few more places [nfc] Follow up to `a6d2a8d6f5`. These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.	2021-04-06 13:18:53 -07:00
Philip Reames	fb41cae039	More precisely type code used for gc.relocate assertions [nfc]	2021-04-06 11:27:36 -07:00
Abhina Sreeskantharajan	82b3e28e83	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text Problem: On SystemZ we need to open text files in text mode. On Windows, files opened in text mode add a CRLF '\r\n', which may not be desirable. Solution: This patch adds two new flags - OF_CRLF which indicates that CRLF translation is used. - OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation. Developers should now use either OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text; if they do want carriage returns on Windows, they should use OF_TextWithCRLF. So this is the behaviour per platform with my patch: z/OS: OF_None: open in binary mode OF_Text : open in text mode OF_TextWithCRLF: open in text mode Windows: OF_None: open file with no carriage return OF_Text: open file with no carriage return OF_TextWithCRLF: open file with carriage return The major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if OF_CRLF is set. ``` if (Flags & OF_CRLF) CrtOpenFlags |= _O_TEXT; ``` The following files are the ones that still use OF_Text, which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows. ./llvm/lib/Support/raw_ostream.cpp ./llvm/lib/TableGen/Main.cpp ./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp ./llvm/unittests/Support/Path.cpp ./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp ./clang/lib/Frontend/CompilerInstance.cpp ./clang/lib/Driver/Driver.cpp ./clang/lib/Driver/ToolChains/Clang.cpp Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D99426	2021-04-06 07:23:31 -04:00
Simon Pilgrim	ddbb58736a	[KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI. As promised in D98866	2021-04-06 10:11:41 +01:00
Serguei Katkov	0057ec8034	[Statepoint] Factor-out utility function to get non-foldable area of STATEPOINT like instructions. NFC Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D99875	2021-04-06 11:44:37 +07:00
Stanislav Mekhanoshin	30b3aab329	Copy syncscope when expanding atomicrmw into cmpxchg loop Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902	2021-04-05 17:29:38 -07:00
Nikita Popov	665065821e	[FastISel] Remove kill tracking This is a followup to D98145: As far as I know, tracking of kill flags in FastISel is just a compile-time optimization. However, I'm not actually seeing any compile-time regression when removing the tracking. This probably used to be more important in the past, before FastRA was switched to allocate instructions in reverse order, which means that it discovers kills as a matter of course. As such, the kill tracking doesn't really seem to serve a purpose anymore, and just adds additional complexity and potential for errors. This patch removes it entirely. The primary changes are dropping the hasTrivialKill() method and removing the kill arguments from the emitFast methods. The rest is mechanical fixup. Differential Revision: https://reviews.llvm.org/D98294	2021-04-03 15:50:13 +02:00
Simon Pilgrim	4ea5475a3f	[KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI. Include exhaustive test coverage.	2021-04-02 21:44:33 +01:00
Jun Ma	274ac9d40e	[AArch64][SVE] Lowering sve.dot to DOT node Differential Revision: https://reviews.llvm.org/D99699	2021-04-02 20:05:17 +08:00
Sander de Smalen	0f7bbbc481	Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed. In order to bring up scalable vector support in LLVM incrementally, we introduced behaviour to emit a warning, instead of an error, when asking the wrong question of a scalable vector, like asking for the fixed number of elements. This patch puts that behaviour under a flag. The default behaviour is that the compiler will always error, which means that all LLVM unit tests and regression tests will now fail when a code-path is taken that still uses the wrong interface. The behaviour to demote an error to a warning can be individually enabled for tools that want to support experimental use of scalable vectors. This patch enables that behaviour when driving compilation from Clang. This means that for users who want to try out scalable-vector support, fixed-width codegen support, or build user-code with scalable vector intrinsics, Clang will not crash and burn when the compiler encounters such a case. This allows us to do away with the following pattern in many of the SVE tests: RUN: .... 2>%t RUN: cat %t | FileCheck --check-prefix=WARN WARN-NOT: warning: ... The behaviour to emit warnings is only temporary and we expect this flag to be removed in the future when scalable vector support is more stable. This patch also fixes the following tests: unittests: ScalableVectorMVTsTest.SizeQueries SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG regression tests: Transforms/InstCombine/vscale_gep.ll Reviewed By: paulwalker-arm, ctetreau Differential Revision: https://reviews.llvm.org/D98856	2021-04-02 10:55:22 +01:00
Mircea Trofin	ce61def529	[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs() needs to be called before iterating the interfering live ranges. The rest of the patch helps ensure that this is the case: instead of clearing the query's InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix is invalidated (existing triggering mechanism). Without the change in RegAllocGreedy.cpp, the compiler ICEs. This patch should make it more easily discoverable by developers that collectInterferringVregs needs to be called before iterating. I will follow up with a subsequent patch to improve the usability and maintainability of Query. Differential Revision: https://reviews.llvm.org/D98232	2021-04-01 08:33:28 -07:00
Simon Pilgrim	77d625f8d8	[DAG] MergeInnerShuffle with BinOps - sometimes accept undef mask elements If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well. This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.	2021-04-01 14:33:00 +01:00
Simonas Kazlauskas	777a58e05b	Support {S,U}REMEqFold before legalization This allows these optimisations to apply to e.g. `urem i16` directly before `urem` is promoted to i32 on architectures where i16 operations are not intrinsically legal (such as on Aarch64). The legalization then later can happen more directly and generated code gets a chance to avoid wasting time on computing results in types wider than necessary, in the end. Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like {D87976}. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D88785	2021-04-01 01:35:41 +03:00
Craig Topper	9e00b6660d	[SelectionDAG] Remove unneeded vector resize from the end of FoldConstantArithmetic. NFC There's an assert right before that makes sure the size already matches. Earlier in this function's life, scalars and vectors shared more code.	2021-03-31 12:33:10 -07:00
Yang Fan	0d7fd9f0d0	[GlobalISel] Fix Wint-in-bool-context warning (NFC) GCC warning: ``` /llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp: In member function ‘bool llvm::CombinerHelper::matchFunnelShiftToRotate(llvm::MachineInstr&)’: /llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:3882:35: warning: ?: using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context] 3882 | Opc == TargetOpcode::G_FSHL ? TargetOpcode::G_ROTL : TargetOpcode::G_ROTR; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ```	2021-03-31 09:59:43 +08:00
Amara Emerson	a35c2c7942	[GlobalISel] Implement fewerElements legalization for vector reductions. This patch adds 3 methods, one for power-of-2 vectors which use tree reductions using vector ops, before a final reduction op. For non-pow-2 types it generates multiple narrow reductions and combines the values with scalar ops. Differential Revision: https://reviews.llvm.org/D97163	2021-03-30 11:19:21 -07:00
Amara Emerson	91887cd4ec	[AArch64][GlobalISel] Combine funnel shifts to rotates. Differential Revision: https://reviews.llvm.org/D99388	2021-03-30 11:00:36 -07:00
Sourabh Singh Tomar	f13f050551	[DebugInfo] Support for signed constants inside DIExpression Negative numbers are represented using DW_OP_consts along with signed representation of the number as the argument. Test case IR is generated using Fortran front-end. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99273	2021-03-30 23:20:38 +05:30
Jessica Paquette	700431128e	[GlobalISel][AArch64] Combine G_SEXT_INREG + right shift -> G_SBFX Basically a port of isBitfieldExtractOpFromSExtInReg in AArch64ISelDAGToDAG. This is only done post-legalization for now. Once the legalizer knows how to decompose these back into shifts, this requirement can probably be removed. Differential Revision: https://reviews.llvm.org/D99230	2021-03-30 10:14:30 -07:00
Amara Emerson	f5e9be6fdb	[GlobalISel] Implement lowering for G_ROTR and G_ROTL. This is a straightforward port. Differential Revision: https://reviews.llvm.org/D99449	2021-03-30 09:44:41 -07:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to its name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Alok Kumar Sharma	9fb0025f70	[DebugInfo] Upgrade DISubrange::count to accept DIExpression also This is needed for Fortran assumed-shape arrays, whose dimensions are defined as follows: - 'count' is taken from the array descriptor passed as a parameter by the caller; access from the descriptor is expressed as a DIExpression. - 'lowerBound' is defined by the callee. The current alternative uses upperBound in place of count, where upperBound is calculated in the callee in a temporary variable from lowerBound and count. Representing the dimension with count (DIExpression) is not only clearer than using upperBound (DIVariable); it also has the advantage that count, being accessed through a parameter, has a better chance of surviving at higher optimization levels than upperBound, which is a local variable. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99335	2021-03-30 09:16:55 +05:30
Rahman Lavaee	90c401cab6	[Propeller] Do not generate the BB address map for empty functions. Empty functions (functions with no real code) are irrelevant for propeller optimizations and their addresses sometimes conflict with other functions which obfuscates the analysis. This simple change skips the BB address map emission for such functions. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D99395	2021-03-29 20:15:01 -07:00
Roger Ferrer Ibanez	489ca73ac4	[PrologEpilogInserter][AMDGPU] Only adjust offset for emergency spill slots if the stack grows down D89239 adjusts the stack offset of emergency spill slots for overaligned stacks. However the adjustment is not valid for targets whose stack grows up (such as AMDGPU). This change makes the adjustment conditional only to those targets whose stack grows down. Fixes https://bugs.llvm.org/show_bug.cgi?id=49686 Differential Revision: https://reviews.llvm.org/D99504	2021-03-29 17:26:58 +00:00
Bradley Smith	9745dce8c3	[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps This is currently performed in SelectionDAGLegalize, here we make it also happen in LegalizeVectorOps, allowing a target to lower the SETCC condition codes first in LegalizeVectorOps and then lower to a custom node afterwards, without having to duplicate all of the SETCC condition legalization in the target specific lowering. As a result of this, fixed length floating point SETCC nodes can now be properly lowered for SVE. Differential Revision: https://reviews.llvm.org/D98939	2021-03-29 15:32:25 +01:00
Florian Hahn	eb3d9f2eb6	[SelDag] Add isIntOrFPConstant helper function. This patch adds a new isIntOrFPConstant helper function to check if a SDValue is an integer or FP constant. This pattern is used in various places. There also are places that incorrectly just check for integer constants, e.g. D99384, so hopefully this helper will help people avoid that issue. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D99428	2021-03-28 12:48:58 +01:00
Amara Emerson	55533203d7	[GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates. Differential Revision: https://reviews.llvm.org/D99383	2021-03-25 17:23:30 -07:00
Jessica Paquette	23f657c165	[AArch64][GlobalISel] Emit bzero on Darwin Darwin platforms for both AArch64 and X86 can provide optimized `bzero()` routines. In this case, it may be preferable to use `bzero` in place of a memset of 0. This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can be generated by platforms which may want to use bzero. To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The conditions for this are largely a port of the bzero case in `AArch64SelectionDAGInfo::EmitTargetCodeForMemset`. The only difference in comparison to the SelectionDAG code is that, when compiling for minsize, this will fire for all memsets of 0. The original code notes that it's not beneficial to do this for small memsets; however, using bzero here will save a mov from wzr. For minsize, I think that it's preferable to prioritise omitting the mov. This also fixes a bug in the libcall legalization code which would delete instructions which could not be legalized. It also adds a check to make sure that we actually get a libcall name. Code size improvements (Darwin): - CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign) - CTMark -Oz: -0.2% geomean (-0.5% on bullet) Differential Revision: https://reviews.llvm.org/D99358	2021-03-25 17:14:25 -07:00
Amara Emerson	0d2c4db637	[GlobalISel] Fix crash in RBS with a non-generic IMPLICIT_DEF. This may occur when swifterror codegen in the translator generates these, but we shouldn't try to handle them since they should have regclasses anyway. rdar://75784009 Differential Revision: https://reviews.llvm.org/D99287	2021-03-24 23:08:51 -07:00
Sander de Smalen	55d18b3cc2	[TTI] Return a TypeSize from getRegisterBitWidth. This patch changes the interface to take a RegisterKind, to indicate whether the register bitwidth of a scalar register, fixed-width vector register, or scalable vector register must be returned. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D98874	2021-03-24 14:45:13 +00:00
Serguei Katkov	311d81ce97	[RegAlloc] Fix "ran out of regs" with uses in statepoint A statepoint instruction is known to have a variable and large number of operands. It is possible that the register allocator will split live intervals in such a way that all physical registers are occupied by "zero-length" live intervals which are marked as not-spillable. Although intervals are marked as not-spillable at the moment of creation, when they really are zero-length, it is possible that later, as part of re-materialization, a physical register will be needed between the def and use of such a tiny interval (for a use not related to this interval at all). As all physical registers are assigned to not-spillable intervals, there are no available registers and RA reports an error. The idea of the fix is to avoid marking tiny live intervals as not-spillable where there is a use in the var args section of a statepoint instruction. Such an interval can perfectly well be spilled and folded into an operand of the statepoint. Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98766	2021-03-24 10:25:34 +07:00
serge-sans-paille	e19884cd74	Introduce a generic operator to apply complex operations to BitVector This avoids temporary and memcpy call when computing large expressions. It's basically some kind of poor man's expression template, but it seems easier to maintain to have a single generic `apply` call instead of the whole expression template machinery here. Differential Revision: https://reviews.llvm.org/D98176	2021-03-23 14:23:26 +01:00
Matt Arsenault	b24436ac96	GlobalISel: Lower funnel shifts	2021-03-23 09:11:17 -04:00
David Sherwood	748ae5281d	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Pushpinder Singh	d0e5422eb8	[GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO Reviewed By: foad Differential Revision: https://reviews.llvm.org/D93963	2021-03-23 05:45:43 +00:00
Max Kazantsev	105dc0f9de	[NFC] Fix typo longre -> longer	2021-03-23 12:13:52 +07:00
Rahman Lavaee	949abf7d6a	[llvm-readelf, propeller] Add fallthrough bit to basic block metadata in BB-Address-Map section. This patch adds a fallthrough bit to basic block metadata, indicating whether the basic block can fallthrough without taking any branches. The bit will help us avoid an Intel LBR bug which results in occasional duplicate entries at the beginning of the LBR stack. This patch uses `MachineBasicBlock::canFallThrough()` to set the bit. This is not a const method because it eventually calls `TargetInstrInfo::analyzeBranch`, but it calls this function with the default `AllowModify=false`. So we can either make the argument to `getBBAddrMapMetadata` non-const, or we can use `const_cast` when calling `canFallThrough`. I decided to go with the latter since this is purely due to legacy code, and in general we should not allow the BasicBlock to be mutable during `getBBAddrMapMetadata`. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D96918	2021-03-22 21:38:05 -07:00
Sanjay Patel	664d0c052c	[TargetTransformInfo] move branch probability query from TargetLoweringInfo This is no-functional-change intended (NFC), but needed to allow optimizer passes to use the API. See D98898 for a proposed usage by SimplifyCFG. I'm simplifying the code by removing the cl::opt. That was added back with the original commit in D19488, but I don't see any evidence in regression tests that it was used. Target-specific overrides can use the usual patterns to adjust as necessary. We could also restore that cl::opt, but it was not clear to me exactly how to do it in the convoluted TTI class structure.	2021-03-22 15:55:34 -04:00
Matt Arsenault	9fdfd8dd52	GlobalISel: Add utility function to constant fold FP ops	2021-03-22 14:38:17 -04:00
Matt Arsenault	c34819afe3	GlobalISel: Handle G_BUILD_VECTOR in isKnownToBeAPowerOfTwo	2021-03-22 14:20:35 -04:00
Craig Topper	2f13e63f9e	[LegalizeDAG] Add asserts to verify the types of custom legalized operation matches the original node. We've messed this up a few times recently on RISCV. Experiments with these asserts found a couple issues on other targets as well. They've all been cleaned up now so we can put in these asserts to catch future issues. I had to waive Glue because ADDC/ADDE/etc legalization replaces Glue with i32 on at least AArch64. X86 used to do the same before we switched to ADDCARRY. So I guess that's just how that works. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D98979	2021-03-22 10:28:51 -07:00
Craig Topper	30080b003e	[DAGCombiner] Minor compile time improvement to (sext_in_reg (sign_extend_vector_inreg x)) optimization. Don't bother calling ComputeNumSignBits if N00Bits < ExtVTBits. No matter what answer we get back this will be true: (N00Bits - DAG.ComputeNumSignBits(N00, DemandedSrcElts)) < ExtVTBits) So we might as well save the computation. This makes the code more consistent with the similar (sext_in_reg (sext x)) handling above.	2021-03-21 11:16:41 -07:00
Matt Arsenault	20a24af01d	MIR: Fix missing serialization for HasTailCall	2021-03-21 13:14:04 -04:00
Matt Arsenault	1098acd46d	GlobalISel: Avoid unnecessary truncation to i64 We can just directly pass through the APInt to create a new constant.	2021-03-21 10:07:41 -04:00
Simon Pilgrim	64c2641c89	[DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.	2021-03-21 14:01:37 +00:00
Jessica Paquette	4773dd5ba9	[GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes) There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG. E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain code that matches a bitfield extract from an and + right shift. Rather than duplicating code in the same way, this adds two opcodes: - G_UBFX (unsigned bitfield extract) - G_SBFX (signed bitfield extract) They work like this ``` %x = G_UBFX %y, %lsb, %width ``` Where `lsb` and `width` are - The least-significant bit of the extraction - The width of the extraction This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends the result, while G_SBFX sign-extends the result. This should allow us to use the combiner to match the bitfield extraction patterns rather than duplicating pattern-matching code in each target. Differential Revision: https://reviews.llvm.org/D98464	2021-03-19 14:37:19 -07:00
Simon Pilgrim	9d2df96407	[DAG] computeKnownBits - add ISD::MULHS/MULHU/SMUL_LOHI/UMUL_LOHI handling Reuse the existing KnownBits multiplication code to handle the 'extend + multiply + extract high bits' pattern for multiply-high ops. Noticed while looking at the codegen for D88785 / D98587 - the patch helps division-by-constant expansion code in particular, which suggests that we might have some further KnownBits div/rem cases we could handle - but this was far easier to implement. Differential Revision: https://reviews.llvm.org/D98857	2021-03-19 16:02:31 +00:00
Simon Pilgrim	ffb2887103	[DAG] Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),undef) -> bop(shuffle'(x,y),shuffle'(z,w)) Followup to D96345, handle unary shuffles of binops (as well as binary shuffles) if we can merge the shuffle with inner operand shuffles. Differential Revision: https://reviews.llvm.org/D98646	2021-03-19 14:14:56 +00:00
Craig Topper	182b831aeb	[DAGCombiner][RISCV] Teach visitMGATHER/MSCATTER to remove gather/scatters with all zeros masks that use SPLAT_VECTOR. Previously only all zeros BUILD_VECTOR was recognized.	2021-03-18 15:34:14 -07:00
Simon Pilgrim	1ba5c550d4	[DAG] Improve folding (sext_in_reg (*_extend_vector_inreg x)) -> (sext_vector_inreg x) Extend this to support ComputeNumSignBits of the (used) source vector elements so that we can handle more than just the case where we're sext_in_reg from the source element signbit. Noticed while investigating the poor codegen in D98587.	2021-03-18 15:34:53 +00:00
Matt Arsenault	b9a0384983	GlobalISel: Preserve source value information for outgoing byval args Pass through the original argument IR value in order to preserve the aliasing information in the memcpy memory operands.	2021-03-18 09:16:54 -04:00
Matt Arsenault	61f834cc09	GlobalISel: Insert memcpy for outgoing byval arguments byval requires an implicit copy between the caller and callee such that the callee may write into the stack area without it modifying the value in the parent. Previously, this was passing through the raw pointer value which would break if the callee wrote into it. Most of the time, this copy can be optimized out (however we don't have the optimization SelectionDAG does yet). This will trigger more fallbacks for AMDGPU now, since we don't have legalization for memcpy yet (although we should stop using byval anyway).	2021-03-18 09:16:54 -04:00
Simon Pilgrim	b1afa187c8	[DAG] SelectionDAG::isSplatValue - add ISD::ABS handling Add ISD::ABS to the existing unary instructions handling for splat detection This is similar to D83605, but doesn't appear to need to touch any of the wasm refactoring. Differential Revision: https://reviews.llvm.org/D98778	2021-03-18 10:28:29 +00:00
Amara Emerson	28963d895b	[GlobalISel] Don't DCE LIFETIME_START/LIFETIME_END markers. These are pseudos without any users, so DCE was killing them in the combiner. Marking them as having side effects doesn't seem quite right since they don't. Gives a nice 0.3% geomean size win on CTMark -Os. Differential Revision: https://reviews.llvm.org/D98811	2021-03-17 18:02:08 -07:00
Amara Emerson	d7fed7b899	[AArch64][GlobalISel] Fall back if disabling neon/fp in the translator. The previous technique relied on early-exiting the legalizer predicate initialization, leaving an empty rule table. That causes a fallback for most instructions, but some, like G_ZEXT, have legacy rules defined which can try to continue but then crash. We should fall back earlier, in the translator, to avoid this issue. Differential Revision: https://reviews.llvm.org/D98730	2021-03-17 15:08:08 -07:00
Stephen Tozer	3bfddc2593	Reapply "[DebugInfo] Handle multiple variable location operands in IR" Fixed section of code that iterated through a SmallDenseMap and added instructions in each iteration, causing non-deterministic code; replaced SmallDenseMap with MapVector to prevent non-determinism. This reverts commit `01ac6d1587`.	2021-03-17 16:45:25 +00:00
Hans Wennborg	01ac6d1587	Revert "[DebugInfo] Handle multiple variable location operands in IR" This caused non-deterministic compiler output; see comment on the code review. > This patch updates the various IR passes to correctly handle dbg.values with a > DIArgList location. This patch does not actually allow DIArgLists to be produced > by salvageDebugInfo, and it does not affect any pass after codegen-prepare. > Other than that, it should cover every IR pass. > > Most of the changes simply extend code that operated on a single debug value to > operate on the list of debug values in the style of any_of, all_of, for_each, > etc. Instances of setOperand(0, ...) have been replaced with > replaceVariableLocationOp, which takes the value that is being replaced as an > additional argument. In places where this value isn't readily available, we have > to track the old value through to the point where it gets replaced. > > Differential Revision: https://reviews.llvm.org/D88232 This reverts commit `df69c69427`.	2021-03-17 13:36:48 +01:00
Nikita Popov	40bc309911	Revert "[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration" This reverts commit `d40b4911bd`. This causes a large compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=0aa637b2037d882ddf7861284169abf63f524677&to=d40b4911bd9aca0573752e065f29ddd9aff280e1&stat=instructions	2021-03-16 20:41:26 +01:00
Mircea Trofin	d40b4911bd	[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs() needs to be called before iterating the interfering live ranges. The rest of the patch offers support that is the case: instead of clearing the query's InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix is invalidated (existing triggering mechanism). Without the change in RegAllocGreedy.cpp, the compiler ices. This patch should make it more easily discoverable by developers that collectInterferringVregs needs to be called before iterating. I will follow up with a subsequent patch to improve the usability and maintainability of Query. Differential Revision: https://reviews.llvm.org/D98232	2021-03-16 12:10:10 -07:00
serge-sans-paille	35368bbdbb	[NFC] Replace loop by idiomatic llvm::find_if	2021-03-16 12:49:19 +01:00
serge-sans-paille	6e040a19db	[NFC] Wisely nest dyn_cast in FunctionLoweringInfo Take advantage of the inheritance tree to avoid a few comparisons.	2021-03-16 10:22:44 +01:00
Fangrui Song	5d44c92bf8	Change void getNoop(MCInst &NopInst) to MCInst getNop() Prefer (self-documenting) return values to output parameters (which are liable to be used). While here, rename Noop to Nop which is more widely used and improves consistency with hasEmitNops/setEmitNops/emitNop/etc.	2021-03-15 12:05:34 -07:00
Fraser Cormack	0035decae7	[CodeGen] Fix issues with scalable-vector INSERT/EXTRACT_SUBVECTORs This patch addresses a few issues when dealing with scalable-vector INSERT_SUBVECTOR and EXTRACT_SUBVECTOR nodes. When legalizing in DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR, we store the low and high halves to the stack separately. The offset for the high half was calculated incorrectly. Additionally, we can optimize this process when we can detect that the subvector is contained entirely within the low/high split vector type. While this optimization is valid on scalable vectors, when performing the 'high' optimization, the subvector must also be a scalable vector. Note that the 'low' optimization is still conservative: it may be possible to insert v2i32 into the low half of a split nxv1i32/nxv1i32, but we can't guarantee it. It is always possible to insert v2i32 into nxv2i32 or v2i32 into nxv4i32+2 as we know vscale is at least 1. Lastly, in SelectionDAG::isSplatValue, we early-exit on the extracted subvector value type being a scalable vector, forgetting that we can also extract a fixed-length vector from a scalable one. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98495	2021-03-15 17:04:21 +00:00
Philip Reames	7d38a91a7f	Restore fixed version of "[CodeGenPrepare] Fix isIVIncrement (PR49466)" Change was reverted in commit `8d20f2c2c6` because it was causing an infinite loop. `9228f2f32` fixed the root issue in the code structure, this change just reapplies the original change w/adaptation to the new code structure.	2021-03-13 15:25:02 -08:00
Philip Reames	9228f2f322	[CGP] Consolidate logic for getIVIncrement and isIVIncrement This fixes the bug demonstrated by the test case in the commit message of `8d20f2c2` (which was a revert of `cf82700`). The root issue was that we have two transforms which are inverses of each other. We use one for simple induction variables (where we can use the post-inc form), and the other for everything else. The problem was that the two transforms could disagree about whether something was an induction variable. The reverted commit made a change to one of the matcher routines which was used for one of the two transforms without updating the other matcher. However, it's worth noting the existing code w/o the reverted change also has cases where the decision could differ between the two paths. The fix is simply to consolidate the code such that two paths must agree by construction, and to add an assert to catch any potential future re-divergence. Triggering the infinite loop requires side stepping the SunkAddrs cache. The SunkAddrs cache has the effect of suppressing the iteration in the common case, but there are codepaths through CGP which restart iteration and clear this cache. Unfortunately, I have not been able to construct a standalone IR test case for this. The original test case is a c++ program which when compiled by clang demonstrates the infinite loop, but all of my attempts at extracting an IR test case runnable through opt/llc have failed to reproduce. (Including capturing the IR at point of the transform itself!) I have no idea what weird state clang is creating here. I also tried creating a test case by hand, but gave up after about an hour of trying to find the right combination to dance through multiple transforms to create the end result needed to trip the bug.	2021-03-13 14:55:25 -08:00
Craig Topper	5b825433d7	[DAGCombiner] Optimize 1-bit smulo to AND+SETNE. A 1-bit smulo overflows if both inputs are -1, since the result should be +1, which can't be represented in a signed 1-bit value. We can detect this with an AND and a setcc. The multiply result can also use the same AND. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D97634	2021-03-13 09:39:36 -08:00
Jordan Rupprecht	8d20f2c2c6	Revert "[CodeGenPrepare] Fix isIVIncrement (PR49466)" This reverts commit `cf82700af8` due to a compile timeout when building the following with `clang -O2`: ``` template <class, class = int> class a; struct b { using d = int ; }; struct e { using f = b::d; }; class g { public: e::f h; e::f i; }; template <class, class> class a : g { public: long j() const { return i - h; } long operator[](long) const noexcept; }; template <class c, class k> long a<c, k>::operator[](long l) const noexcept { return h[l]; } template <typename m, typename n> int fn1(m, n, const char ); int o, p; class D { void q(const a<long> &); long r; }; void D::q(const a<long> &l) { int s; if (l[0]) for (; l.j(); ++s) { if (l[s]) while (fn1(o, 0, "")) ; r = l[s] / p; } } ```	2021-03-12 13:59:14 -08:00
Craig Topper	2ea7014089	[DAGCombiner] Use isConstantSplatVectorAllZeros/Ones instead of isBuildVectorAllZeros/Ones in visitMSTORE and visitMLOAD. This allows us to optimize when the mask is a splat_vector in addition to build_vector.	2021-03-12 12:14:56 -08:00
Nikita Popov	42eb658f65	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC) This removes some (but not all) uses of type-less CreateGEP() and CreateInBoundsGEP() APIs, which are incompatible with opaque pointers. There are still a number of tricky uses left, as well as many more variation APIs for CreateGEP.	2021-03-12 21:01:16 +01:00
Matt Arsenault	6b76d82853	GlobalISel: Fix marking byval arguments as immutable byval arguments need to be assumed writable. Only implicitly stack passed arguments which aren't addressable in the IR can be assumed immutable. Mips is still broken since for some reason it's doing its own thing with the ValueHandlers (and x86 doesn't actually handle byval arguments now, although some of the code is there).	2021-03-12 09:01:53 -05:00
Matt Arsenault	34471c3060	GlobalISel: Partially fix handling of byval arguments This was essentially ignoring byval and treating them as a pointer argument which needed to be loaded from. This should copy the frame index value to the virtual register, not insert a load from the frame index into the pointer value. For AMDGPU, this was producing a load from the byval pointer argument, to a pointer used for the byval arguments. I do not understand how AArch64 managed to work before since it appears to be similarly broken. We could also change the ValueHandler API to avoid the extra copy from the frame index, since currently it returns a new register. I believe there is still an issue with outgoing byval arguments. These should have a copy inserted in case the callee decided to overwrite the memory.	2021-03-12 09:01:53 -05:00
LemonBoy	cfe69c8efd	[SelectionDAG] Improve scalarization of irregular vector types Use a more general strategy when splitting a vector into scalar parts (and vice-versa) to correctly handle vector types whose element size is not a power of 2 (and a multiple of 8). Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D98273	2021-03-11 19:57:13 +01:00
David Green	fad70c3068	[ARM] Improve WLS lowering Recently we improved the lowering of low overhead loops and tail predicated loops, but concentrated first on the DLS do style loops. This extends those improvements over to the WLS while loops, improving the chance of lowering them successfully. To do this the lowering has to change a little as the instructions are terminators that produce a value - something that needs to be treated carefully. Lowering starts at the Hardware Loop pass, inserting a new llvm.test.start.loop.iterations that produces both an i1 to control the loop entry and an i32 similar to the llvm.start.loop.iterations intrinsic added for do loops. This feeds into the loop phi, properly gluing the values together: %wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div) %wls0 = extractvalue { i32, i1 } %wls, 0 %wls1 = extractvalue { i32, i1 } %wls, 1 br i1 %wls1, label %loop.ph, label %loop.exit ... loop: %lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ] .. %iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1) %cmp = icmp ne i32 %iv.next, 0 br i1 %cmp, label %loop, label %loop.exit The llvm.test.start.loop.iterations need to be lowered through ISel lowering as a pair of WLS and WLSSETUP nodes, which each get converted to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent t2WhileLoopStart from being a terminator that produces a value, something difficult to control at that stage in the pipeline. Instead the t2WhileLoopSetup produces the value of LR (essentially acting as a lr = subs rn, 0), t2WhileLoopStart consumes that lr value (the Bcc). These are then converted into a single t2WhileLoopStartLR at the same point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop to prevent them from progressing further in the pipeline. The t2WhileLoopStartLR is a single instruction that takes a GPR and produces LR, similar to the WLS instruction. 
%1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3 t2B %bb.1 ... bb.2.loop: %2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2 ... %3:gprlr = t2LoopEndDec %2:gprlr, %bb.2 t2B %bb.3 The t2WhileLoopStartLR can then be treated similar to the other low overhead loop pseudos, eventually being lowered to a WLS providing the branches are within range. Differential Revision: https://reviews.llvm.org/D97729	2021-03-11 17:56:19 +00:00
Craig Topper	e9426dfbae	[ValueTypes][RISCV] Add MVT for v1f16. RISCV makes all fixed vector MVTs with size less than or equal to a command line option legal. This didn't include v1f16 because it was missing but did include v1f32 and v1f64. One test is affected where we did test this type, but it is a horizontal reduction so it is non-sensical. Perhaps we should canonicalize that away somewhere. I'm not sure if we should be making v1 types legal, but this will at least make RISCV consistent across all types. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98365	2021-03-11 09:23:18 -08:00
Matt Arsenault	cf5ecd5644	GlobalISel: Fix off by one in finding explicit byval alignment For attribute sets, the return index is at 0, and arguments start at 1. getParamAlignment adds the offset of 1, so we need to convert from attribute index back to IR index.	2021-03-11 10:23:08 -05:00
Stephen Tozer	f40976bd01	Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reverts commit `c0f3dfb9f1`. Reverted due to an error on the clang-x64-windows-msvc buildbot.	2021-03-11 14:48:01 +00:00
gbtozers	c0f3dfb9f1	[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic operations with two or more non-const operands; this includes the GetElementPtr instruction, and most Binary Operator instructions. These salvages produce DIArgList locations and are only valid for dbg.values, as currently variadic DIExpressions must use DW_OP_stack_value. This functionality is also only added for salvageDebugInfoForDbgValues; other functions that directly call salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be updated in a later patch. Differential Revision: https://reviews.llvm.org/D91722	2021-03-11 13:33:49 +00:00
Serguei Katkov	0480927712	[Statepoint Lowering] Handle the case with several gc.result Recently gc.result has been marked with readnone instead of readonly and this opens a door for different optimization to duplicate gc.result. Statepoint lowering is not ready to see several gc.results. The problem appears when there are gc.results with one located in the same basic block and another located in other basic block. In this case we need both export VR and fill local setValue. Note that this case is not sufficient optimization done before CodeGen. It is evident that local gc.result dominates all other gc.results and it is handled by GVN and EarlyCSE. But anyway, even if IR is not optimal Backend should not crash on a valid IR. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98393	2021-03-11 18:44:44 +07:00
David Blaikie	80d1f657a1	Fix unused lambda capture in a non-asserts build For locally scoped lambdas like this there's no particular benefit to explicitly listing captures - or avoiding capturing this. Switch to [&] and make it all easier to maintain. (& driveby change std::function to llvm::function_ref)	2021-03-11 00:22:18 -08:00
Daniel Sanders	134a179dee	[mir] Change 'undef' for MMO base addresses to 'unknown-address' Differential Revision: https://reviews.llvm.org/D98100	2021-03-10 16:46:44 -08:00
Quentin Colombet	66dab2fa84	[NFC] Fix compiler warnings Fix warnings caused by -Wrange-loop-analysis. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98298	2021-03-10 11:03:50 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've removed the special case I previously put in for ABS for D97991 as the default expansion is now able to successfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Stephen Tozer	1db137b185	[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578	2021-03-10 17:15:24 +00:00
Stephen Tozer	e64f3ccca3	Reapply "[DebugInfo] Add DWARF emission for DBG_VALUE_LIST" This reverts commit `429c6ecbb3`.	2021-03-10 15:59:24 +00:00
Stephen Tozer	429c6ecbb3	Revert "[DebugInfo] Add DWARF emission for DBG_VALUE_LIST" This reverts commit `0da27ba56c`. This revision was causing an error on the sanitizer-x86_64-linux-autoconf build.	2021-03-10 14:35:33 +00:00
gbtozers	0da27ba56c	[DebugInfo] Add DWARF emission for DBG_VALUE_LIST This patch allows DBG_VALUE_LIST instructions to be emitted to DWARF with valid DW_AT_locations. This change mainly affects DbgEntityHistoryCalculator, which now tracks multiple registers per value, and DwarfDebug+DwarfExpression, which can now emit multiple machine locations as part of a DWARF expression. Differential Revision: https://reviews.llvm.org/D83495	2021-03-10 13:46:20 +00:00
Christudasan Devadasan	4c6ab48fb1	GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM It is good to have a combined `divrem` instruction when the `div` and `rem` are computed from identical input operands. Some targets can lower them through a single expansion that computes both division and remainder. This effectively reduces the number of instructions compared with expanding them individually. Reviewed By: arsenm, paquette Differential Revision: https://reviews.llvm.org/D96013	2021-03-10 18:46:07 +05:30
Jinzheng Tu	481079e284	[NFC] Unify FIME with FIXME in comments There are 5 occurrences of FIME and 15333 of FIXME. All of them should be FIXME. Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D98321	2021-03-10 14:00:51 +01:00
Serguei Katkov	2fccd1b00a	[Statepoint Lowering] Fix the crash with gc.relocate in a separate block If it was decided to relocate a derived pointer using the spill, its value is not exported in the general case. When gc.relocate is located in a different block than the statepoint, we cannot get the SD for the derived value, but for the spill case it is not required at all. However, the implementation of gc.relocate lowering unconditionally requests the SD value, triggering the assert. The CL fixes this by handling the spill case before the SD is actually required. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98324	2021-03-10 19:51:04 +07:00
gbtozers	7d0cafba96	[DebugInfo] Process DBG_VALUE_LIST in LiveDebugVariables This patch adds support for DBG_VALUE_LIST in the LiveDebugVariables pass. The changes are mostly in computeIntervals, extendDef, and addDefsFromCopies; when extending the def of a DBG_VALUE_LIST the live ranges of every used register must be considered, and when such a def is killed by more than one of its used registers being killed at the same time it is necessary to find valid copies of all of those registers to create a new def with. The DebugVariableValue class has also been changed to reference multiple location numbers instead of just one. This has been accomplished by using a C-style array with a unique_ptr and an array length packed into 6 bits, to minimize the size of the class (which must be kept low to be used with IntervalMap). This may not be the most efficient solution possible, and should be looked at if performance issues arise. Differential Revision: https://reviews.llvm.org/D83895	2021-03-10 12:37:59 +00:00
Philip Reames	d6394d86ca	[cgp] improve robustness of uadd/usub transforms LSR prefers to schedule iv increments just before the latch. The recent `80511565` broadened this to moving increments in the original IR. This pointed out a robustness problem with the CGP transform. When we have a use of an induction increment outside of the loop (we canonicalize away from this form, but it happens, e.g. with unanalyzable loops) we'd avoid performing the uadd/usub transform. Interestingly, all of these involve moving the increment closer to its operands, so there's no concern about dominating all uses. We can handle that case cheaply, resulting in a more robust transform.	2021-03-09 11:52:08 -08:00
Amara Emerson	55e760769b	[GlobalISel] Fold away G_BUILD_VECTOR with all elements extracted. If every element is extracted from a G_BUILD_VECTOR, pass through the source registers. This is different to the extract(build_vector) combine because this one tolerates multiple users as long as they're exhaustive. Differential Revision: https://reviews.llvm.org/D97890	2021-03-09 11:34:26 -08:00

30512 Commits