llvm-project

Commit Graph

Author	SHA1	Message	Date
Jay Foad	edfdce2627	[PHIElimination] Fix accounting for undef uses when updating LiveVariables PHI elimination updates LiveVariables info as described here: // We only need to update the LiveVariables kill of SrcReg if this was the // last PHI use of SrcReg to be lowered on this CFG edge and it is not live // out of the predecessor. We can also ignore undef sources. Unfortunately if the last use also happened to be an undef use then it would fail to update the LiveVariables at all. Fix this by not counting undef uses in the VRegPHIUse map. Thanks to Mikael Holmén for the test case! Differential Revision: https://reviews.llvm.org/D111552	2021-10-11 20:22:47 +01:00
Amara Emerson	f95d9c95bb	[GlobalISel] Fix the stores of truncates -> wide store combine for non-evenly dividing type sizes. If the wide store we'd generate is not a multiple of the memory type of the narrow stores (e.g. s48 and s32), we'd assert. Fix that.	2021-10-09 21:18:20 -07:00
Dávid Bolvanský	943b304848	Fixed some errors detected by PVS Studio	2021-10-09 17:27:41 +02:00
Dávid Bolvanský	3649fb14d1	Fixed some errors detected by PVS Studio	2021-10-09 17:20:04 +02:00
Arthur Eubanks	a0a4935182	Make more places that use alignment use uint64_t Followup to D110451.	2021-10-08 16:35:19 -07:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h\|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Amara Emerson	17b89f9daa	[GlobalISel] Improve G_UMULH -> LSHR combine to accept non-uniform constant vectors.	2021-10-08 11:25:26 -07:00
Craig Topper	a9700653ab	[RegisterScavenging] Use a Twine in a call to report_fatal_error instead of going from std::string to c_str. NFC The std::string was built on the line above. Might as well just build it as a Twine in the call.	2021-10-08 11:04:08 -07:00
Bradley Smith	7c68d4b8ff	Revert "[SelectionDAG] Remove PromoteIntOp_EXTRACT_SUBVECTOR." This reverts commit `3e8d2008f7`. The code removed in this commit is actually required for extracting fixed types from illegal scalable types, hence this commit causes assertion failures in such extracts.	2021-10-08 14:53:26 +00:00
Mirko Brkusanin	d20840c937	[GlobalISel] Combine for eliminating redundant operand negations Differential Revision: https://reviews.llvm.org/D111319	2021-10-08 14:29:22 +02:00
Amara Emerson	72ce310bf0	[GlobalISel][IRTranslator] Fix a use-after-free bug when translating trap-func-name traps. This was using MachineFunction::createExternalSymbolName() before, which seems reasonable, but in fact this is freed before the asm emitter which tries to access the function name string. Switching it to use the string returned by the attribute seems to fix the problem.	2021-10-07 23:51:37 -07:00
Amara Emerson	08b3c0d995	[GlobalISel] Combine G_UMULH x, (1 << c) -> x >> (bitwidth - c) In order to not generate an unnecessary G_CTLZ, I extended the constant folder in the CSEMIRBuilder to handle G_CTLZ. I also added some extra handling of vector constants too. It seems we don't have any support for doing constant folding of vector constants, so the tests show some other useless G_SUB instructions too. Differential Revision: https://reviews.llvm.org/D111036	2021-10-07 23:51:37 -07:00
Jay Foad	b84d9d299e	[TargetPassConfig] Remove an obsolete FIXME comment The "coloring with register" functionality it refers to was removed ten years ago in svn r144481 "Remove the -color-ss-with-regs option".	2021-10-08 07:34:25 +01:00
Itay Bookstein	faa0e2ae76	[SelectionDAG] Fix shift libcall ABI mismatch in shift-amount argument The shift libcalls have a shift amount parameter of MVT::i32, but sometimes ExpandIntRes_Shift may be called with a node whose second operand is a type that is larger than that. This leads to an ABI mismatch, and for example causes a spurious zeroing of a register in RV32 for 64-bit shifts. Note that at present regular shift instructions already have their shift amount operand adapted at SelectionDAGBuilder::visitShift time, and funnelled shifts bypass that. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110508	2021-10-08 09:57:57 +08:00
Wang, Pengfei	c236883b6b	[X86] Optimize fdiv with reciprocal instructions for half type Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110557	2021-10-08 09:41:13 +08:00
Adrian Prantl	9f93f2bfbd	Do not emit prologue_end for line 0 locs if there is a non-zero loc present This change fixes a bug where the compiler generates a prologue_end for line 0 locs. Line 0 is not associated with any source location, so there should not be a prologue_end at a location that doesn't correspond to a source location. Some LLVM tests were explicitly checking for line 0 prologue_end's; since I believe that to be incorrect, I had to change those tests as well. Patch by Shubham Rastogi! Differential Revision: https://reviews.llvm.org/D110740	2021-10-07 13:54:28 -07:00
Jay Foad	097339b1ca	[TargetPassConfig] Enable machine verification after miscellaneous passes In a couple of places machine verification was disabled for no apparent reason, probably just because an "addPass(..., false)" line was cut and pasted from elsewhere. After this patch the only remaining place where machine verification is disabled in the generic TargetPassConfig code, is after addPreEmitPass.	2021-10-07 21:24:50 +01:00
Jay Foad	27c57e791a	[TwoAddressInstruction] Enable machine verification after this pass Differential Revision: https://reviews.llvm.org/D111007	2021-10-07 20:04:51 +01:00
Jay Foad	3ff0a5747d	[PHIElimination] Enable machine verification after this pass Differential Revision: https://reviews.llvm.org/D111006	2021-10-07 20:04:51 +01:00
Jay Foad	3c9dfba189	[PHIElimination] Account for INLINEASM_BR when inserting kills When PHIElimination adds kills after lowering PHIs to COPYs it knows that some instructions after the inserted COPY might use the same SrcReg, but it was only looking at the terminator instructions at the end of the block, not at other instructions like INLINEASM_BR that can appear after the COPY insertion point. Since we have already called findPHICopyInsertPoint, which knows about INLINEASM_BR, we might as well reuse the insertion point that it calculated when looking for instructions that might use SrcReg. This fixes a machine verification failure if you force machine verification to run after PHIElimination (currently it is disabled for other reasons) when running test/CodeGen/X86/callbr-asm-phi-placement.ll. Differential Revision: https://reviews.llvm.org/D110834	2021-10-07 20:04:39 +01:00
Amara Emerson	8bfc0e06dc	[GlobalISel] Port the udiv -> mul by constant combine. This is a straight port from the equivalent DAG combine. Differential Revision: https://reviews.llvm.org/D110890	2021-10-07 11:37:17 -07:00
Jay Foad	548b01c7a6	[MIRParser] Add support for IsInlineAsmBrIndirectTarget Print this basic block flag as inlineasm-br-indirect-target and parse it. This allows you to write MIR test cases for INLINEASM_BR. The test case I added is one that I wanted to precommit anyway for D110834. Differential Revision: https://reviews.llvm.org/D111291	2021-10-07 19:08:01 +01:00
Bradley Smith	5be266db7a	[AArch64][SVE] Improve VECTOR_SPLICE codegen for VL > 128-bit Differential Revision: https://reviews.llvm.org/D111135	2021-10-07 15:28:55 +00:00
Jack Andersen	bd4dad87f4	[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand Based on the reasoning of D53903, register operands of DBG_VALUE are invariably treated as RegState::Debug operands. This change enforces this invariant as part of MachineInstr::addOperand so that all passes emit this flag consistently. RegState::Debug is inconsistently set on DBG_VALUE registers throughout LLVM. This runs the risk of a filtering iterator like MachineRegisterInfo::reg_nodbg_iterator to process these operands erroneously when not parsed from MIR sources. This issue was observed in the development of the llvm-mos fork which adds a backend that relies on physical register operands much more than existing targets. Physical RegUnit 0 has the same numeric encoding as $noreg (indicating an undef for DBG_VALUE). Allowing debug operands into the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision of register numbers with different zero semantics). Eventually, this causes an assert where DBG_VALUE instructions are prohibited from participating in live register ranges. Reviewed By: MatzeB, StephenTozer Differential Revision: https://reviews.llvm.org/D110105	2021-10-07 16:08:52 +01:00
Carl Ritson	b5d6ad20e1	[MachineCopyPropagation] Handle propagation of undef copies When propagating undefined copies the undef flag must also be propagated. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D111219	2021-10-07 20:34:27 +09:00
Jay Foad	df2d4bc4cb	[TwoAddressInstruction] Fix ReplacedAllUntiedUses in processTiedPairs Fix the calculation of ReplacedAllUntiedUses when any of the tied defs are early-clobber. The effect of this is to fix the placement of kill flags on an instruction like this (from @f2 in test/CodeGen/SystemZ/asm-18.ll): INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], killed %4:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit After TwoAddressInstruction without this patch: %3:grh32bit = COPY killed %4:grh32bit INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit Note that the COPY kills %4, even though there is a later use of %4 in the INLINEASM. This fails machine verification if you force it to run after TwoAddressInstruction (currently it is disabled for other reasons). After TwoAddressInstruction with this patch: %3:grh32bit = COPY %4:grh32bit INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit Differential Revision: https://reviews.llvm.org/D110848	2021-10-07 10:10:11 +01:00
Mikael Holmen	9bf5d91361	[GlobalISel] Silence gcc warning about unused variable	2021-10-07 07:18:04 +02:00
Itay Bookstein	40ec1c0f16	[IR][NFC] Rename getBaseObject to getAliaseeObject To better reflect the meaning of the now-disambiguated {GlobalValue, GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction (D109792), the function is renamed to getAliaseeObject.	2021-10-06 19:33:10 -07:00
David Blaikie	f6a561c4d6	DebugInfo: Use clang's preferred names for integer types This reverts `c7f16ab3e3` / r109694 - which suggested this was done to improve consistency with the gdb test suite. Possible that at the time GCC did not canonicalize integer types, and so matching types was important for cross-compiler validity, or that it was only a case of over-constrained test cases that printed out/tested the exact names of integer types. In any case neither issue seems to exist today based on my limited testing - both gdb and lldb canonicalize integer types (in a way that happens to match Clang's preferred naming, incidentally) and so never print the original text name produced in the DWARF by GCC or Clang. This canonicalization appears to be in `integer_types_same_name_p` for GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb. (I tested this with one translation unit defining 3 variables - `long`, `long (*)()`, and `int (*)()`, and another translation unit that had main, and a function that took `long (*)()` as a parameter - then compiled them with mismatched compilers (either GCC+Clang, or Clang+(Clang with this patch applied)) and no matter the combination, despite the debug info for one CU naming the type "long int" and the other naming it "long", both debuggers printed out the name as "long" and were able to correctly perform overload resolution and pass the `long int (*)()` variable to the `long (*)()` function parameter) Did find one hiccup, identified by the lldb test suite - that CodeView was relying on these names to map them to builtin types in that format. So added some handling for that in LLVM. (these could be split out into separate patches, but seems small enough to not warrant it - will do that if there ends up needing any reverting/revisiting) Differential Revision: https://reviews.llvm.org/D110455	2021-10-06 16:02:34 -07:00
Arthur Eubanks	05392466f0	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 13:29:23 -07:00
Arthur Eubanks	569346f274	Revert "Reland [IR] Increase max alignment to 4GB" This reverts commit `8d64314ffe`.	2021-10-06 11:38:11 -07:00
Arthur Eubanks	8d64314ffe	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 11:03:51 -07:00
Arthur Eubanks	72cf8b6044	Revert "[IR] Increase max alignment to 4GB" This reverts commit `df84c1fe78`. Breaks some bots	2021-10-06 10:21:35 -07:00
Arthur Eubanks	df84c1fe78	[IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 09:54:14 -07:00
Amara Emerson	79d13bf22c	Revert "Revert "[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable""" This reverts commit `d95cd81141`. Re-land the original patch now that the bug this exposed in selection has been fixed by `6bc64e24c3`	2021-10-06 04:16:19 -07:00
Simon Pilgrim	21661607ca	[llvm] Replace report_fatal_error(std::string) uses with report_fatal_error(Twine) As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.	2021-10-06 12:04:30 +01:00
David Sherwood	37edb7d3e2	[SVE] Fix incorrect DAG combines when extracting fixed-width from scalable vectors We were previously silently generating incorrect code when extracting a fixed-width vector from a scalable vector. This is worse than crashing, since the user will have no indication that this is currently unsupported behaviour. I have fixed the code to only perform DAG combines when safe to do so, i.e. the input and output vectors are both fixed-width or both scalable. Test added here: CodeGen/AArch64/sve-extract-scalable-vector.ll Differential revision: https://reviews.llvm.org/D110624	2021-10-06 09:27:44 +01:00
Amara Emerson	6bc64e24c3	[GlobalISel] Clear unreachable blocks' contents after selection. If these blocks are unreachable, then we can discard all of the instructions. However, keep the block around because it may have an address taken or the block may have a stale reference from a PHI somewhere. Instead of finding those PHIs and fixing them up, just leave the block empty. Differential Revision: https://reviews.llvm.org/D111201	2021-10-05 23:06:22 -07:00
Simon Pilgrim	2e5daac217	[llvm] Update report_fatal_error calls from raw_string_ostream to use Twine(OS.str()) As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared. We can use the raw_string_ostream::str() method to perform the implicit flush() and return a reference to the std::string container that we can then wrap inside Twine().	2021-10-05 18:42:12 +01:00
Joe Nash	8f55fdf26c	[MacroFusion] Expose useful static methods. NFC. hasLessThanNumFused and fuseInstructionPair are useful for DAG mutations similar to MacroFusion, but which cannot use MacroFusion as a whole (such as fusing non-dependent instruction). Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D111070 Change-Id: I3a5d56aba0471d45ef64cebb9b724030e2eae2f3	2021-10-05 11:51:48 -04:00
Amara Emerson	de5b16d8ca	Revert "Revert "Revert "[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable"""" This reverts commit `c93bc508ee`. Seems to break a different thing now.	2021-10-05 08:25:13 -07:00
Jay Foad	f65458df32	[PHIElimination] Update LiveVariables after handling an unspillable terminator Update the LiveVariables analysis after the special handling for unspillable terminators which was added in D91358. This is just enough to fix some "Block should not be in AliveBlocks" / "Block missing from AliveBlocks" errors in the codegen test suite when machine verification is forced to run after PHIElimination (currently it is disabled). Differential Revision: https://reviews.llvm.org/D110939	2021-10-05 14:25:53 +01:00
Jeremy Morse	e265644b32	[DebugInfo][InstrRef] Track all of DBG_PHIs operands An important part of the instruction referencing solution is that we identify all the registers that values move between before we then compute an SSA-like function from the machine code, and from the variable intrinsics. DBG_PHIs weren't causing all the subregisters of their operands to be tracked; this patch forces that to happen. The practical implications were that not enough space is allocated for storing values when analysing the function -- asan will crash on the attached test case with an unpatched compiler. Non-asan llc's will produce a DBG_VALUE $noreg, where it should be $dil. Differential Revision: https://reviews.llvm.org/D109064	2021-10-05 14:01:26 +01:00
Mirko Brkusanin	40e00063bc	[GlobalISel] Combine fabs(fneg(x)) to fabs(x) Differential Revision: https://reviews.llvm.org/D110943	2021-10-05 13:43:39 +02:00
Bjorn Pettersson	8ed0e6b2cf	[SelectionDAG] Replace error prone index check in BaseIndexOffset::computeAliasing Deriving NoAlias based on having the same index in two BaseIndexOffset expressions seemed weird (and as shown in the added unittest the correctness of doing so depended on undocumented pre-conditions that the user of BaseIndexOffset::computeAliasing would need to take care of). This patch removes the code that derived NoAlias based on indices being the same. As a compensation, to avoid regressions/diffs in various lit tests, we also add a new check. The new check derives NoAlias in case the two base pointers are based on two different GlobalValue:s (neither of them being a GlobalAlias). Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D110256	2021-10-05 12:15:55 +02:00
Bjorn Pettersson	1896fb2cff	[SelectionDAG] Assume that a GlobalAlias may alias other global values This fixes a bug detected in DAGCombiner when using global alias variables. Here is an example: @foo = global i16 0, align 1 @aliasFoo = alias i16, i16 * @foo define i16 @bar() { ... store i16 7, i16 * @foo, align 1 store i16 8, i16 * @aliasFoo, align 1 ... } BaseIndexOffset::computeAliasing would incorrectly derive NoAlias for the two accesses in the example above, resulting in DAGCombiner miscompiles. This patch fixes the problem by a defensive approach letting BaseIndexOffset::computeAliasing return false, i.e. that the aliasing couldn't be determined, when comparing two global values and at least one is a GlobalAlias. In the future we might improve this with a deeper analysis to look at the aliasee for the GlobalAlias etc. But that is a bit more complicated considering that we could have 'local_unnamed_addr' and situations with several 'alias' variables. Fixes PR51878. Differential Revision: https://reviews.llvm.org/D110064	2021-10-05 12:15:55 +02:00
Jay Foad	0a031f5c88	[GlobalISel] Simplify narrowScalarMul. NFC. Remove some redundancy because the source and result types of any multiply are always the same.	2021-10-05 10:53:12 +01:00
Jay Foad	0bd4365445	[LiveIntervals] Fix verification of early-clobbered segments Enable verification of live intervals immediately after computing them (when -early-live-intervals is used) and fix a problem that that provokes: currently the verifier insists that a segment that ends at an early-clobber slot must be followed by another segment starting at the same slot. But before TwoAddressInstruction runs, the equivalent condition is: a segment that ends at an early-clobber slot must have its last use tied to an early-clobber def. That condition is harder to check here, so for now just disable this check until tied operands have been rewritten. Differential Revision: https://reviews.llvm.org/D111065	2021-10-05 08:17:56 +01:00
Amara Emerson	cfef1803dd	[GlobalISel] Port over the SelectionDAG stack protector codegen feature. This is a port of the feature that allows the StackProtector pass to omit checking code for stack canary checks, and rely on SelectionDAG to do it at a later stage. The reasoning behind this seems to be to prevent the IR checking instructions from hindering tail-call optimizations during codegen. Here we allow GlobalISel to also use that scheme. Doing so requires that we do some analysis using some factored-out code to determine where to generate code for the epilogs. Not every case is handled in this patch since we don't have support for all targets that exercise different stack protector schemes. Differential Revision: https://reviews.llvm.org/D98200	2021-10-04 21:33:44 -07:00
Amara Emerson	c93bc508ee	Revert "Revert "[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable""" This reverts commit `d95cd81141`. The selector sometimes leaves unreachable blocks unselected because it uses a postorder traversal for the block ordering. With the trap intrinsics now being emitted, these blocks are no longer empty and the unselected G_INTRINSIC instructions survive past selection. To fix this, keep track of which blocks are selected and later delete any blocks that weren't selected.	2021-10-04 18:10:28 -07:00
Amara Emerson	d95cd81141	Revert "[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable"" This reverts commit `019041bec3`. It broke some bots.	2021-10-04 15:44:52 -07:00
Jeremy Morse	e2b838dd91	[DebugInfo][InstrRef] Accept landingpad block arguments This patch makes instruction-referencing accept an additional scenario where values can be read from physical registers at the start of blocks. As far as I was aware, this only happened: * With arguments in the entry block, * With constant physical registers. To which this patch adds a third case: * With exception-handling landing-pad blocks. In the attached test: the operand of the dbg.value traces back to the "landingpad" instruction, which becomes some copies from physregs. Right now, that's deemed unacceptable, and the assertion fires. The fix is to just accept this scenario; this is a case where the value in question is defined by a register and a position, not by an instruction that defines it. Reading it with a DBG_PHI is the correct behaviour, there isn't a non-copy instruction that we can refer to. Differential Revision: https://reviews.llvm.org/D109005	2021-10-04 23:03:02 +01:00
Amara Emerson	8bde5e58c0	Delay outgoing register assignments to last. The delayed stack protector feature which is currently used for SDAG (and thus allows for more commonly generating tail calls) depends on being able to extract the tail call into a separate return block. To do this it also has to extract the vreg->physreg copies that set up the call's arguments, since if it doesn't then the call inst ends up using undefined physregs in its new spliced block. SelectionDAG implementations can do this because they delay emitting register copies until after the stack arguments are set up. GISel however just processes and emits the arguments in IR order, so stack arguments always end up last, and thus this breaks the code that looks for any register arg copies that precede the call instruction. This patch adds a thunk argument to the assignValueToReg() and custom assignment hooks. For outgoing arguments, register assignments use this return param to return a thunk that does the actual generating of the copies. We collect these until all the outgoing stack assignments have been done and then execute them, so that the copies (and perhaps some artifacts like G_SEXTs) are placed after any stores. Differential Revision: https://reviews.llvm.org/D110610	2021-10-04 12:33:20 -07:00
Jay Foad	24688f8fdf	Revert "[GlobalISel] Support vectors in LegalizerHelper::narrowScalarMul" This reverts commit `90da0b9a5a`. It was causing an LLVM_ENABLE_EXPENSIVE_CHECKS buildbot failure.	2021-10-04 20:26:30 +01:00
Amara Emerson	dafcbfdaa0	[GlobalISel] Widen G_EXTRACT_VECTOR_ELT using anyext instead of sext. G_SEXT seems to be unnecessary here, anyext will do. Differential Revision: https://reviews.llvm.org/D110469	2021-10-04 12:19:19 -07:00
Jay Foad	90da0b9a5a	[GlobalISel] Support vectors in LegalizerHelper::narrowScalarMul Also remove some redundancy because the source and result types of any multiply are always the same. Differential Revision: https://reviews.llvm.org/D110926	2021-10-04 19:33:38 +01:00
Amara Emerson	019041bec3	[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable" We were previously just ignoring unreachable, but targets like Darwin want to keep unreachable instructions as traps. Differential Revision: https://reviews.llvm.org/D110603	2021-10-04 11:02:29 -07:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Kazu Hirata	d34cd75d89	[Analysis, CodeGen] Migrate from arg_operands to args (NFC) Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-10-03 08:22:20 -07:00
Takafumi Arakaki	e8806d7486	Re-apply the fix on DwarfEHPrepare and add a test This patch re-introduces the fix in the commit https://github.com/llvm/llvm-project/commit/66b0cebf7f736 by @yrnkrn > In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling > > pointer to a dead function. To make sure it's valid, doFinalization nullptrs > RewindFunction just like the constructor and so it will be found on next run. > > llvm-svn: 217737 It seems that the fix was not migrated to `DwarfEHPrepareLegacyPass`. This patch also updates `llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` to include `-run-twice` to exercise the cleanup. Without this patch `llvm-lit -v llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` fails with ``` -- Testing: 1 tests, 1 workers -- FAIL: LLVM :: CodeGen/X86/dwarf-eh-prepare.ll (1 of 1) ****************** TEST 'LLVM :: CodeGen/X86/dwarf-eh-prepare.ll' FAILED ****************** Script: -- : 'RUN: at line 1'; /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice < /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -S \| /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -- Exit Code: 2 Command Output (stderr): -- Referencing function in another module! call void @_Unwind_Resume(i8* %ehptr) #1 ; ModuleID = '<stdin>' void (i8) @_Unwind_Resume ; ModuleID = '<stdin>' in function simple_cleanup_catch LLVM ERROR: Broken function found, compilation aborted! PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice -S 1. Running pass 'Function Pass Manager' on module '<stdin>'. 2. Running pass 'Module Verifier' on function '@simple_cleanup_catch' #0 0x000056121b570a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:0 #1 0x000056121b56eb64 llvm::sys::RunSignalHandlers() /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Signals.cpp:97:0 #2 0x000056121b56f28e SignalHandler(int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:397:0 #3 0x00007fc7e9b22980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980) #4 0x00007fc7e87d3fb7 raise /build/glibc-S7xCS9/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0 #5 0x00007fc7e87d5921 abort /build/glibc-S7xCS9/glibc-2.27/stdlib/abort.c:81:0 #6 0x000056121b4e1386 llvm::raw_svector_ostream::raw_svector_ostream(llvm::SmallVectorImpl<char>&) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:674:0 #7 0x000056121b4e1386 llvm::report_fatal_error(llvm::Twine const&, bool) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/ErrorHandling.cpp:114:0 #8 0x000056121b4e1528 (/home/arakaki/build/llvm-project/main/bin/opt+0x29e3528) #9 0x000056121adfd03f llvm::raw_ostream::operator<<(llvm::StringRef) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:218:0 FileCheck error: '<stdin>' is empty. FileCheck command line: /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -- ****************** ****************** Failed Tests (1): LLVM :: CodeGen/X86/dwarf-eh-prepare.ll Testing Time: 0.22s Failed: 1 ``` Reviewed By: loladiro Differential Revision: https://reviews.llvm.org/D110979	2021-10-02 21:50:35 -04:00
Simon Pilgrim	df672f66b6	[DAG] scalarizeExtractedVectorLoad - replace getABITypeAlign with allowsMemoryAccess (PR45116) One of the cases identified in PR45116 - we don't need to limit extracted loads to ABI alignment, we can use allowsMemoryAccess - which tests using getABITypeAlign, but also checks if a target permits (fast) misaligned memory loads by checking allowsMisalignedMemoryAccesses as a fallback. I've also cleaned up the alignment calculation code - if we have a constant extraction index then the alignment can be based on an offset from the original vector load alignment, but for non-constant indices we should assume the worst (single element alignment only). Differential Revision: https://reviews.llvm.org/D110486	2021-10-01 21:07:34 +01:00
Jay Foad	dff3454bda	[TwoAddressInstruction] Tweak constraining of tied operands In collectTiedOperands, when handling an undef use that is tied to a def, constrain the dst reg with the actual register class of the src reg, instead of with the register class from the instruction's MCInstrDesc. This makes a difference in some AMDGPU test cases like this, before: %16:sgpr_96 = INSERT_SUBREG undef %15:sgpr_96_with_sub0_sub1(tied-def 0), killed %11:sreg_64_xexec, %subreg.sub0_sub1 After, without this patch: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec This fails machine verification if you force it to run after TwoAddressInstruction (currently it is disabled) with: * Bad machine code: Invalid register class for subregister index * - function: s_load_constant_v3i32_align4 - basic block: %bb.0 (0xa011a88) - instruction: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec - operand 0: undef %16.sub0_sub1:sgpr_96 Register class SGPR_96 does not fully support subreg index 4 After, with this patch: undef %16.sub0_sub1:sgpr_96_with_sub0_sub1 = COPY killed %11:sreg_64_xexec See also svn r159120 which introduced the code to handle tied undef uses. Differential Revision: https://reviews.llvm.org/D110944	2021-10-01 20:57:58 +01:00
Jay Foad	31c92d515d	[MachineLoopInfo] Enable machine verification after this pass Enabling this does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110703	2021-10-01 18:15:57 +01:00
Jay Foad	04787239c9	[LiveVariables] Skip verification of kills inside bundles LiveVariables does not examine the contents of bundles, so MachineVerifier should not expect it to know about kill flags on operands of instructions inside a bundle. With this fix we can enable machine verification after running the LiveVariables analysis. Doing this does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110700	2021-10-01 18:15:57 +01:00
Jay Foad	08d41f75d9	[UnreachableMachineBlockElim] Enable machine verification after this pass Enabling this does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110697	2021-10-01 18:15:57 +01:00
Jay Foad	2bfe777a45	[ProcessImplicitDefs] Enable machine verification after this pass Enabling this does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110695	2021-10-01 18:15:56 +01:00
Jay Foad	fd8e99700d	[DetectDeadLanes] Enable machine verification after this pass Machine verification after DetectDeadLanes has been disabled since the pass was first added in D18427, but I guess this was just due to copy- and-paste. Enabling it does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110689	2021-10-01 18:15:56 +01:00
Marcelo Juchem	dfb213c2df	Fix ambiguous overload build failure LLVM (llvmorg-14-init) under Debian sid using latest gcc (Debian 10.3.0-9) 10.3.0 fails due to ambiguous overload on operators == and !=: /root/src/llvm/src/llvm/tools/obj2yaml/elf2yaml.cpp:212:22: error: ambiguous overload for 'operator!=' (operand types are 'llvm::ELFYAML::ELF_SHF' and 'int') /root/src/llvm/src/llvm/tools/obj2yaml/elf2yaml.cpp:204:32: error: ambiguous overload for 'operator!=' (operand types are 'const llvm::yaml::Hex64' and 'int') /root/src/llvm/src/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp:629:35: error: ambiguous overload for 'operator==' (operand types are 'const uint64_t' {aka 'const long unsigned int'} and 'llvm::Register') Reviewed by: StephenTozer, jmorse, Higuoxing Differential Revision: https://reviews.llvm.org/D109534	2021-10-01 14:19:57 +01:00
Sander de Smalen	b62e6f19d7	[SelectionDAG] Handle promotion + widening in getCopyToPartsVector Some vectors require both widening and promotion for their legalization. This case is not yet handled in getCopyToPartsVector and falls back on scalarizing by default. Because scalable vectors can't easily be scalarised, we need to implement this in two separate stages: 1. Widen the vector. 2. Promote the vector. As part of this patch, PromoteIntRes_CONCAT_VECTORS also needed to be made scalable aware. Instead of falling back on scalarizing the vector (fixed-width only), each sub-part of the CONCAT vector is promoted, and the operation is performed on the type with the widest element type, finally truncating the result to the promoted result type. Differential Revision: https://reviews.llvm.org/D110646	2021-10-01 08:19:47 +01:00
Christopher Tetreault	3077bc90de	[NFC] Restore magic and magicu to a globally visible location While these functions are only used in one location in upstream, it has been reused in multiple downstreams. Restore this file to a globally visible location (outside of APInt.h) to eliminate downstream breakage and enable potential future reuse. Additionally, this patch renames types and cleans up clang-tidy issues.	2021-09-30 17:43:12 -07:00
Amara Emerson	ca8316b704	[GlobalISel] Extend CombinerHelper::matchConstantOp() to match constant splat vectors. This allows the "x op 0 -> x" fold to optimize vector constant RHSs. Differential Revision: https://reviews.llvm.org/D110802	2021-09-30 14:31:25 -07:00
Amara Emerson	80f4bb5c61	[GlobalISel] Extend G_SELECT of known condition combine to vectors. Adds a new utility function: isConstantOrConstantSplatVector(). Differential Revision: https://reviews.llvm.org/D110786	2021-09-30 12:16:44 -07:00
Kazu Hirata	f631173d80	[llvm] Migrate from arg_operands to args (NFC) Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-09-30 08:51:21 -07:00
Brock Wyma	bafd8b1add	[CodeView] Recognize Fortran95 as Fortran instead of MASM Map Fortran95 sources to Fortran so the CodeView language is not emitted as MASM. Differential Revision: https://reviews.llvm.org/D110330	2021-09-30 09:27:05 -04:00
Jay Foad	156d7d2df7	[LiveIntervals] Remove unused subreg ranges in repairIntervalsInRange If the old instructions mentioned a subreg that the new instructions do not, remove the subrange for that subreg. For example, in TwoAddressInstructionPass::eliminateRegSequence, if a use operand in the REG_SEQUENCE has the undef flag then we don't generate a copy for it so after the elimination there should be no live interval at all for the corresponding subreg of the def. This is a small step towards switching TwoAddressInstructionPass over from LiveVariables to LiveIntervals. Currently this path is only tested if you explicitly enable -early-live-intervals. Differential Revision: https://reviews.llvm.org/D110542	2021-09-30 09:15:10 +01:00
Sander de Smalen	6709b193ea	[SelectionDAG] Make WidenVecRes_EXTRACT_SUBVECTOR work for scalable vectors. The legalizer handles this by breaking up an EXTRACT_SUBVECTOR into smaller parts, and combines those together, padding the result with UNDEF vectors, e.g. nxv6i64 extract_subvector(nxv12i64, 6) <-> nxv8i64 concat( nxv2i64 extract_subvector(nxv16i64, 6) nxv2i64 extract_subvector(nxv16i64, 8) nxv2i64 extract_subvector(nxv16i64, 10) nxv2i64 undef) Reviewed By: frasercrmck, david-arm Differential Revision: https://reviews.llvm.org/D110253	2021-09-29 11:33:45 +01:00
Jay Foad	27179b39f9	[RemoveRedundantDebugValues] Enable machine verification after this pass Machine verification after RemoveRedundantDebugValues has been disabled since the pass was first added in D105279, but I guess this was just due to copy-and-paste. Enabling it does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110688	2021-09-29 10:44:35 +01:00
Itay Bookstein	7255ce30e4	[SelectionDAG] Fix incorrect condition for shift amount truncation Comment says: // If the operand is larger than the shift count type but the shift // count type has enough bits to represent any shift value ... It clearly talks about the shifted operand, not the shift-amount operand, but the comparison is performed against Log2_32_Ceil(Op2.getValueSizeInBits()) where Op2 is the shift amount operand. This comparison also doesn't make sense in the context of the previous one (ShiftsSize > Op2Size) because Op2Size == Op2.getValueSizeInBits(). Fix to use Op1. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110509	2021-09-28 17:52:30 -07:00
Jessica Paquette	15a24e1fdb	[GlobalISel] Combine mulo x, 2 -> addo x, x Similar to what SDAG does when it sees a smulo/umulo against 2 (see: `DAGCombiner::visitMULO`) This pattern is fairly common in Swift code AFAICT. Here's an example extracted from a Swift testcase: https://godbolt.org/z/6cT8Mesx7 Differential Revision: https://reviews.llvm.org/D110662	2021-09-28 16:59:43 -07:00
Arthur Eubanks	aa53785f23	Reland [clang] Rework dontcall attributes To avoid using the AST when emitting diagnostics, split the "dontcall" attribute into "dontcall-warn" and "dontcall-error", and also add the frontend attribute value as the LLVM attribute value. This gives us all the information to report diagnostics we need from within the IR (aside from access to the original source). One downside is we directly use LLVM's demangler rather than using the existing Clang diagnostic pretty printing of symbols. Previous revisions didn't properly declare the new dependencies. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110364	2021-09-28 15:31:30 -07:00
Shoaib Meenai	f9b3c18e74	[CodeGen] Fix wrapping personality symbol on ARM The ARM backend was explicitly setting global binding on the personality symbol. This was added without any comment in `a7ec2dcefd`, which introduced EHABI support (back in 2011). None of the other backends do anything equivalent, as far as I can tell. This causes problems when attempting to wrap the personality symbol. Wrapped symbols are marked as weak inside LTO to inhibit IPO (see https://reviews.llvm.org/D33621). When we wrap the personality symbol, it initially gets weak binding, and then the ARM backend attempts to change the binding to global, which causes an error in MC because of attempting to change the binding of a symbol from non-global to global (the error was added in https://reviews.llvm.org/D90108). Simply drop the ARM backend's explicit global binding setting to fix this. This matches all the other backends, and a large internal application successfully linked and ran with this change, so it shouldn't cause any problems. Test via LLD, since wrapping is required to exhibit the issue. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D110609	2021-09-28 15:01:05 -07:00
Arthur Eubanks	7833d20f1f	Revert "[clang] Rework dontcall attributes" This reverts commit `2943071e2e`. Breaks bots	2021-09-28 14:49:27 -07:00
Arthur Eubanks	2943071e2e	[clang] Rework dontcall attributes To avoid using the AST when emitting diagnostics, split the "dontcall" attribute into "dontcall-warn" and "dontcall-error", and also add the frontend attribute value as the LLVM attribute value. This gives us all the information to report diagnostics we need from within the IR (aside from access to the original source). One downside is we directly use LLVM's demangler rather than using the existing Clang diagnostic pretty printing of symbols. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110364	2021-09-28 14:21:10 -07:00
Jay Foad	845b93e692	[LiveIntervals] Fix another asan debug build failure Call RemoveMachineInstrFromMaps before erasing instrs. repairIntervalsInRange will do this for you after erasing the instruction, but it's not safe to rely on it because assertions in SlotIndexes::removeMachineInstrFromMaps refer to fields in the erased instruction. This fixes asan buildbot failures caused by D110335.	2021-09-28 11:09:38 +01:00
“bhkumarn”	62eeacce17	[DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran namelist variables. DICompositeType is extended to support this fortran feature. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D108553	2021-09-28 14:40:58 +05:30
Jay Foad	20c0280733	[LiveIntervals] Repair subreg ranges in processTiedPairs In TwoAddressInstructionPass::processTiedPairs, update subranges of the live interval for RegB as well as the main range. This is a small step towards switching TwoAddressInstructionPass over from LiveVariables to LiveIntervals. Currently this path is only tested if you explicitly enable -early-live-intervals. Differential Revision: https://reviews.llvm.org/D110526	2021-09-28 08:10:16 +01:00
Jay Foad	b2b1a8b833	[LiveIntervals] Improve repair after convertToThreeAddress After TwoAddressInstructionPass calls TargetInstrInfo::convertToThreeAddress, improve the LiveIntervals repair to cope with convertToThreeAddress creating more than one new instruction. This mostly seems to benefit X86. For example in test/CodeGen/X86/zext-trunc.ll it converts: %4:gr32 = ADD32rr %3:gr32(tied-def 0), %2:gr32, implicit-def dead $eflags to: undef %6.sub_32bit:gr64 = COPY %3:gr32 undef %7.sub_32bit:gr64_nosp = COPY %2:gr32 %4:gr32 = LEA64_32r killed %6:gr64, 1, killed %7:gr64_nosp, 0, $noreg Differential Revision: https://reviews.llvm.org/D110335	2021-09-28 08:10:08 +01:00
Xiang1 Zhang	ebe9944a34	[ISel] Legalized arithmetic.fence.f128 for 32-bits target Reviewed By: Craig Topper, Wang Pengfei Differential Revision: https://reviews.llvm.org/D110467	2021-09-28 10:27:25 +08:00
Fraser Cormack	e2b46e336b	[DAGCombiner][VP] Fold zero-length or false-masked VP ops This patch adds a generic DAGCombine for vector-predicated (VP) nodes. Those for which we can determine that no vector element is active can be replaced by either undef or, for reductions, the start value. This is tested rather trivially at the IR level, where it's possible that we want to teach instcombine to perform this optimization. However, we can also see the zero-evl case arise during SelectionDAG legalization, when wide VP operations can be split into two and the upper operation emerges as trivially false. It's possible that we could perform this optimization "proactively" (both on legal vectors and before splitting) and reduce the width of an operation and insert it into a larger undef vector: ``` v8i32 vp_add x, y, mask, 4 -> v8i32 insert_subvector (v8i32 undef), (v4i32 vp_add xsub, ysub, mask, 4), i32 0 ``` This is somewhat analogous to similar vector narrow/widening optimizations, but it's unclear at this point whether that's beneficial to do this for VP ops for any/all targets. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D109148	2021-09-27 11:30:09 +01:00
Simon Pilgrim	18c8ed5416	[DAG] ReduceLoadOpStoreWidth - replace getABITypeAlign with allowsMemoryAccess (PR45116) One of the cases identified in PR45116 - we don't need to limit store narrowing to ABI alignment, we can use allowsMemoryAccess - which tests using getABITypeAlign, but also checks if a target permits (fast) misaligned memory access by checking allowsMisalignedMemoryAccesses as a fallback.	2021-09-25 18:35:57 +01:00
Simon Pilgrim	6bd5b1b1ce	[DAG] combineShiftToMULH - move getValueType() inside assert. NFCI. Avoids an unnecessary (void).	2021-09-25 11:56:35 +01:00
David Blaikie	5cb210862b	DebugInfo: Use the signedness of the underlying enum when encoding enum non-type-template-parameters This improves the accuracy of the debug info and improves round tripping through -gsimple-template-names.	2021-09-24 17:02:55 -07:00
Jay Foad	ac51ad24a7	[LiveIntervals] Fix asan debug build failures Call RemoveMachineInstrFromMaps before erasing instrs. repairIntervalsInRange will do this for you after erasing the instruction, but it's not safe to rely on it because assertions in SlotIndexes::removeMachineInstrFromMaps refer to fields in the erased instruction. This fixes asan buildbot failures caused by D110328.	2021-09-24 19:14:57 +01:00
Stanislav Mekhanoshin	08d7eec06e	Revert "Allow rematerialization of virtual reg uses" Reverted due to two distinct performance regression reports. This reverts commit `92c1fd19ab`.	2021-09-24 10:26:11 -07:00
Jay Foad	e4e95f14f1	[LiveIntervals] Repair live intervals that gain subranges In repairIntervalsInRange, if the new instructions refer to subregs but the old instructions did not, make sure any existing live interval for the superreg is updated to have subranges. Also skip repairing any range that we have recalculated from scratch, partly for efficiency but also to avoid some cases that repairOldRegInRange can't handle. The existing test/CodeGen/AMDGPU/twoaddr-regsequence.mir provides some test coverage for this change: when TwoAddressInstructionPass converts REG_SEQUENCE into subreg copies, the live intervals will now get subranges and MachineVerifier will verify that the subranges are correct. Unfortunately MachineVerifier does not complain if the subranges are not present, so the test also passed before this patch. This patch also fixes ~800 of the ~1500 failures in the whole CodeGen lit test suite when -early-live-intervals is forced on. Differential Revision: https://reviews.llvm.org/D110328	2021-09-24 11:58:08 +01:00
Jay Foad	7863cc6c1c	[LiveIntervals] Fix repairOldRegInRange for simple def cases The fix applied in D23303 "LiveIntervalAnalysis: fix a crash in repairOldRegInRange" was over-zealous. It would bail out when the end of the range to be repaired was in the middle of the first segment of the live range of Reg, which was always the case when the range contained a single def of Reg. This patch fixes it as suggested by Matthias Braun in post-commit review on the original patch, and tests it by adding -early-live-intervals to a selection of existing lit tests that now pass. (Note that D23303 was originally applied to fix a crash in SILoadStoreOptimizer, but that is now moot since D23814 updated SILoadStoreOptimizer to run before scheduling so it no longer has to update live intervals.) Differential Revision: https://reviews.llvm.org/D110238 Unrevert with some changes to the tests: - Add -verify-machineinstrs to check for remaining problems in live interval support in TwoAddressInstructionPass. - Drop test/CodeGen/AMDGPU/extract-load-i1.ll since it suffers from some of those remaining problems.	2021-09-24 11:44:49 +01:00
Amara Emerson	9f773b17c2	[GlobalISel][IRTranslator] Fix crash during bit-test switch optimization with odd types. Odd switch case types cause a crash in the conversion to MVT. Instead use a pointer sized scalar type which is what SDAG does in these cases.	2021-09-24 00:19:27 -07:00
Matt Arsenault	2875d3d484	RegAllocGreedy: Remove an unhelpful auto, and don't use a reference	2021-09-23 17:25:25 -04:00
Jay Foad	deb2ca566a	Revert "[LiveIntervals] Fix repairOldRegInRange for simple def cases" This reverts commit `8229cb7412`. It was failing on buildbots with expensive checks enabled.	2021-09-23 17:55:05 +01:00
Jay Foad	8229cb7412	[LiveIntervals] Fix repairOldRegInRange for simple def cases The fix applied in D23303 "LiveIntervalAnalysis: fix a crash in repairOldRegInRange" was over-zealous. It would bail out when the end of the range to be repaired was in the middle of the first segment of the live range of Reg, which was always the case when the range contained a single def of Reg. This patch fixes it as suggested by Matthias Braun in post-commit review on the original patch, and tests it by adding -early-live-intervals to a selection of existing lit tests that now pass. (Note that D23303 was originally applied to fix a crash in SILoadStoreOptimizer, but that is now moot since D23814 updated SILoadStoreOptimizer to run before scheduling so it no longer has to update live intervals.) Differential Revision: https://reviews.llvm.org/D110238	2021-09-23 17:16:14 +01:00
Craig Topper	d5c67bba62	[RegAlloc] Cast uint8_t to unsigned before printing it. raw_ostream interprets uint8_t as wanting to print a character with that ASCII value. In this case the uint8_t is an integer that we want to print.	2021-09-23 08:49:44 -07:00
Simon Pilgrim	2a5936faf0	[CodeGen] ProcessSDDbgValues - use const-ref value in for-range loop. NFCI. Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-23 12:23:46 +01:00
Simon Pilgrim	5cabe4d9d3	[CodeGen] RegisterCoalescer::buildVRegToDbgValueMap - use const-ref value in for-range loop. NFCI. Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-23 12:23:45 +01:00
Fraser Cormack	e7c879a69d	[RISCV][VP] Add support for VP_REDUCE_* operations This patch adds codegen support for lowering the vector-predicated reduction intrinsics to RVV instructions. The process is similar to that of the other reduction intrinsics, save for the fact that every VP reduction has a start value. We reuse the existing custom "VL" nodes, adding extra patterns where required to handle non-true masks. To support these nodes, the `RISCVISD::VECREDUCE_*_VL` nodes have been given an explicit "merge" operand. This is to facilitate the VP reductions, where we must be careful to ensure that even if no operation is performed (when VL=0) we still produce the start value. The RVV reductions don't update the destination register under these conditions, so we tie the splatted start value to the output register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D107657	2021-09-23 11:11:05 +01:00
Jay Foad	6cef28ed2d	[TII] Remove the MFI argument to convertToThreeAddress. NFC. This simplifies the API and addresses a FIXME in TwoAddressInstructionPass::convertInstTo3Addr. Differential Revision: https://reviews.llvm.org/D110229	2021-09-23 08:58:46 +01:00
Bjorn Pettersson	c3ae8ecb52	[DAGCombiner] Rename isAlias as mayAlias. NFC Differential Revision: https://reviews.llvm.org/D110062	2021-09-23 09:54:42 +02:00
Freddy Ye	13207a21a6	[NFC] Remove redundant setOperationAction. [FROUND,FROUNDEVEN][f32, f64, f128] are set Expand twice. Differential Revision: https://reviews.llvm.org/D110302	2021-09-23 10:28:21 +08:00
David Green	c49611f909	Mark CFG as preserved in TypePromotion and InterleaveAccess passes Neither of these passes modify the CFG, allowing us to preserve DomTree and LoopInfo across them by using setPreservesCFG. Differential Revision: https://reviews.llvm.org/D110161	2021-09-22 18:58:00 +01:00
Daniil Fukalov	1a7b7d7ba2	[NFCI][CodeGen, AArch64] Fix inconsistent TargetCostKind types. The pass uses different cost kinds to estimate "old" and "interleaved" costs: default cost kind for all targets override `getInterleavedMemoryOpCost()` is `TCK_SizeAndLatency`. Although at the moment estimated `TCK_Latency` costs are equal to `TCK_SizeAndLatency`, (so the change is NFC) it may change in future. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110100	2021-09-22 20:15:17 +03:00
Hongtao Yu	d9b511d8e8	[CSSPGO] Set PseudoProbeInserter as a default pass. Currently PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e., through the linker or llc), where the user always has to pass -pseudo-probe-for-profiling explicitly. I'm making the pass a default pass that requires no command line arg to trigger, but will be actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110209	2021-09-22 09:09:48 -07:00
Kazu Hirata	3c557cd7f9	[CodeGen] Remove redundant declaration MIRCanonicalizerID (NFC) Note that MIRCanonicalizerID is declared in llvm/include/llvm/CodeGen/Passes.h, which MIRCanonicalizerPass.cpp includes. Identified with readability-redundant-declaration.	2021-09-22 08:58:27 -07:00
Sander de Smalen	3e8d2008f7	[SelectionDAG] Remove PromoteIntOp_EXTRACT_SUBVECTOR. This code seems untested and is likely obsolete, because this case should already be handled by the code that legalizes the result type of EXTRACT_SUBVECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110061	2021-09-22 14:23:35 +01:00
Sander de Smalen	d5681f1d68	[SelectionDAG] Add PromoteIntOp_INSERT_SUBVECTOR. This is required to codegen something like: <vscale x 8 x i16> @llvm.experimental.vector.insert(<vscale x 8 x i16> %vec, <vscale x 2 x i16> %subvec, i64 %idx) where the output vector is legal, but the input vector needs promoting. It implements this by performing the whole operation on the promoted type, and then truncating the result. Reviewed By: david-arm, craig.topper Differential Revision: https://reviews.llvm.org/D110059	2021-09-22 13:32:36 +01:00
Sander de Smalen	4ca1fbe361	[SelectionDAG] Make WidenVecRes_Convert work for scalable vectors. Most of the code wasn't yet scalable safe, although most of the code conceptually just works for scalable vectors. This change makes the algorithm work on ElementCount, where appropriate, and leaves the fixed-width only code to use `getFixedNumElements`. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D110058	2021-09-22 10:58:38 +01:00
Arthur Eubanks	e42234383e	Make DiagnosticInfoResourceLimit's limit param required And always print it. This makes some LLVM diagnostics match up better with Clang's diagnostics. Updated some AMDGPU uses of DiagnosticInfoResourceLimit and now we print better diagnostics for those. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D110204	2021-09-21 15:27:58 -07:00
Craig Topper	aeb63d464f	[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for and/or/xor. This requires a minor change to CodeGenPrepare to ensure that shouldSinkOperands will be called for And. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D110106	2021-09-21 10:07:29 -07:00
Michael Liao	5fb3ae525f	[SelectionDAG] Re-calculate scoped AA metadata when merging stores. Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D102821	2021-09-21 11:41:17 -04:00
Aleksandr Bezzubikov	624e4d087e	[GlobalISel] Support ConstantAsMetadata in IRTranslator When using instructions which have a MetadataAsValue argument (e.g. some target-specific intrinsics) MD canonicalization strips internal MDNodes with a single ConstantAsMetadata child. That prevented IRTranslator from the proper translation of such calls.	2021-09-21 11:24:56 -04:00
Simon Pilgrim	20b58855e0	[CodeGen] SelectionDAGBuilder - Use const-ref iterator in for-range loops. NFCI. Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-21 13:01:08 +01:00
Simon Pilgrim	0f83456cf5	[CodeGen] SDDbgValue::getSDNodes() - use const-ref to avoid unnecessary copies. NFCI. Reported by MSVC static analyzer.	2021-09-21 13:01:08 +01:00
Petar Avramovic	8bc7185668	GlobalISel/Utils: Refactor constant splat match functions Add generic helper function that matches constant splat. It has option to match constant splat with undef (some elements can be undef but not all). Add util function and matcher for G_FCONSTANT splat. Differential Revision: https://reviews.llvm.org/D104410	2021-09-21 12:09:35 +02:00
Amara Emerson	7091a7f781	[GlobalISel][Legalizer] Don't use eraseFromParentAndMarkDBGValuesForRemoval() for some artifacts. For artifacts excluding G_TRUNC/G_SEXT, which have IR counterparts, we don't seem to have debug users of defs. However, in the legalizer we're always calling MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() which is expensive. In some rare cases, this contributes significantly to unreasonably long compile times when we have lots of artifact combiner activity. To verify this, I added asserts to that function when it actually replaced a debug use operand with undef for these artifacts. On CTMark with both -O0 and -Os and debug info enabled, I didn't see a single case where it triggered. In my measurements I saw around a 0.5% geomean compile-time improvement on -g -O0 for AArch64 with this change. Differential Revision: https://reviews.llvm.org/D109750	2021-09-20 23:34:42 -07:00
Amara Emerson	f9d69a0ab0	[GlobalISel] Implement support for the "trap-func-name" attribute. This attribute calls a function instead of emitting a trap instruction. Differential Revision: https://reviews.llvm.org/D110098	2021-09-20 14:32:01 -07:00
Petar Avramovic	e4c46ddd91	[GlobalISel] Improve elimination of dead instructions in legalizer Add eraseInstr(s) utility functions. Before deleting an instruction, collect its use instructions. After deletion, delete use instructions that became trivially dead. This patch clears all dead instructions in existing legalizer mir tests. Differential Revision: https://reviews.llvm.org/D109154	2021-09-20 13:00:58 +02:00
Kazu Hirata	84b07c9b3a	[llvm] Use pop_back_val (NFC)	2021-09-19 13:44:23 -07:00
Kazu Hirata	48719e3b18	[CodeGen] Use make_early_inc_range (NFC)	2021-09-18 09:29:24 -07:00
Kazu Hirata	e2febc2ed4	[llvm] Use drop_begin (NFC)	2021-09-17 09:16:40 -07:00
Simon Pilgrim	4af7643470	[CodeGen] LiveDebug - Use const-ref iterator in for-range loop. NFCI. Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-17 14:04:54 +01:00
Simon Pilgrim	9e70d4e5f2	[AsmPrinter] DebugLocEntry::dump() - Use const-ref iterator in for-range loop. NFCI. Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-17 12:11:54 +01:00
Petar Avramovic	d477a7c2e7	GlobalISel/Utils: Refactor integer/float constant match functions Rework getConstantstVRegValWithLookThrough in order to make it clear if we are matching integer/float constant only or any constant(default). Add helper functions that get DefVReg and APInt/APFloat from constant instr getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal and getIConstantVRegSExtVal. These now only match G_CONSTANT as described in comment. Relevant matchers now return both DefVReg and APInt/APFloat. Replace existing uses of getConstantstVRegValWithLookThrough and getConstantVRegVal with new helper functions. Any constant match is only required in: ConstantFoldBinOp: for constant argument that was bit-cast of float to int getAArch64VectorSplat: AArch64::G_DUP operands can be any constant amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant In other places use integer only constant match. Differential Revision: https://reviews.llvm.org/D104409	2021-09-17 11:22:13 +02:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Kazu Hirata	cfc7402419	[llvm] Use drop_begin (NFC)	2021-09-16 08:46:26 -07:00
Doug Gregor	a773db7d76	Add a command-line flag to control the Swift extended async frame info. Introduce a new command-line flag `-swift-async-fp={auto\|always\|never}` that controls how code generation sets the Swift extended async frame info bit. There are three possibilities: * `auto`: which determines how to set the bit based on deployment target, either statically or dynamically via `swift_async_extendedFramePointerFlags`. * `always`: the default, always set the bit statically, regardless of deployment target. * `never`: never set the bit, regardless of deployment target. Patch by Doug Gregor <dgregor@apple.com> Reviewed By: doug.gregor Differential Revision: https://reviews.llvm.org/D109392	2021-09-16 06:57:45 -07:00
Konstantin Schwarz	d2e66d7fa4	[GlobalISel] Add a combine for and(load , mask) -> zextload This only handles simple masks, not shifted masks, for now. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D109357	2021-09-16 10:42:46 +02:00
Sam Parker	c98a8a09b5	[HardwareLoops] Loop guard intrinsic to recognise zext If a loop count was initially represented by a 32b unsigned int in C then the hardware-loop pass can recognise the loop guard and insert the llvm.test.set.loop.iterations intrinsic. If this was instead an unsigned short/char then clang inserts a zext instruction to expand the loop count to an i32. This patch adds the necessary pattern matching to enable the use of llvm.test.set.loop.iterations in those cases. Patch by: sherwin-dc Differential Revision: https://reviews.llvm.org/D109631	2021-09-16 08:33:16 +01:00
Alok Kumar Sharma	a5b72abc9e	[DebugInfo] Enhance DIImportedEntity to accept children entities New field `elements` is added to '!DIImportedEntity', representing list of aliased entities. This is needed to dump optimized debugging information where all names in a module are imported, but a few names are imported with overriding aliases. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D109343	2021-09-16 10:41:55 +05:30
Ahmed Bougacha	94a2f9cdb6	[GlobalISel] Fix CombinerHelper::isPredecessor for same def/use MI. The doc comment for isPredecessor says: Returns true if \p DefMI precedes \p UseMI or they are the same instruction. And dominates relies on that behavior for its own: Returns true if \p DefMI dominates \p UseMI. By definition an instruction dominates itself. Make both statements correct by fixing isPredecessor. Found by inspection.	2021-09-15 16:45:27 -07:00
Matt Arsenault	87c00878d3	SplitKit: Remove decade old live interval hack This was trying to fixup broken live intervals coming out of the coalescer. The verifier is more complete now and no tests seem to fail without this.	2021-09-15 17:35:59 -04:00
Amara Emerson	5ec1845cad	[AArch64][GlobalISel] Add a new reassociation for G_PTR_ADDs. G_PTR_ADD (G_PTR_ADD X, C), Y) -> (G_PTR_ADD (G_PTR_ADD(X, Y), C) Improves CTMark -Os on AArch64: Program before after diff sqlite3 286932 287024 0.0% kc 432512 432508 -0.0% SPASS 412788 412764 -0.0% pairlocalalign 249460 249416 -0.0% bullet 475740 475512 -0.0% 7zip-benchmark 568864 568356 -0.1% consumer-typeset 419088 418648 -0.1% tramp3d-v4 367628 367224 -0.1% clamscan 383184 382732 -0.1% lencod 430028 429284 -0.2% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D109528	2021-09-14 23:57:41 -07:00
Matt Arsenault	54d755a034	DAG: Fix incorrect folding of fmul -1 to fneg The fmul is a canonicalizing operation, and fneg is not so this would break denormals that need flushing and also would not quiet signaling nans. Fold to fsub instead, which is also canonicalizing.	2021-09-14 21:25:02 -04:00
Matt Arsenault	4a36e96c3f	RegAllocGreedy: Account for reserved registers in num regs heuristic This simple heuristic uses the estimated live range length combined with the number of registers in the class to switch which heuristic to use. This was taking the raw number of registers in the class, even though not all of them may be available. AMDGPU heavily relies on dynamically reserved numbers of registers based on user attributes to satisfy occupancy constraints, so the raw number is highly misleading. There are still a few problems here. In the original testcase that made me notice this, the live range size is incorrect after the scheduler rearranges instructions, since the instructions don't have the original InstrDist offsets. Additionally, I think it would be more appropriate to use the number of disjointly allocatable registers in the class. For the AMDGPU register tuples, there are a large number of registers in each tuple class, but only a small fraction can actually be allocated at the same time since they all overlap with each other. It seems we do not have a query that corresponds to the number of independently allocatable registers. Relatedly, I'm still debugging some allocation failures where overlapping tuples seem to not be handled correctly. The test changes are mostly noise. There are a handful of x86 tests that look like regressions with an additional spill, and a handful that now avoid a spill. The worst looking regression is likely test/Thumb2/mve-vld4.ll which introduces a few additional spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll shows a massive improvement by completely eliminating a large number of spills inside a loop.	2021-09-14 21:00:29 -04:00
Bjorn Pettersson	cd2bff1ef1	[StackColoring] Fix a debug invariance problem Ignore dbg instructions when collecting stack slot markers. This is to make sure the coloring is invariant regarding presence of dbg instructions (even in cases when the dbg instructions might be badly placed in the input). Differential Revision: https://reviews.llvm.org/D109758	2021-09-14 19:21:56 +02:00
vnalamot	726b5d3416	[RegScavenger][NFC] Refer to the already initialized local variable for spill slot index Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D109501	2021-09-13 21:55:33 +05:30
Simon Pilgrim	9db20822f7	[APInt] Add APIntOps::ScaleBitMask helper APInt is used to describe a bit mask in a variety of value tracking and demanded bits/elts functions. When traversing through dst/src operands, we have a number of places where these masks need to be widened/narrowed to translate through bitcasts, reductions etc. to a different type. This patch adds an APIntOps::ScaleBitMask common helper, adds unit test coverage, and updates a number of cases to use the helper instead of their own implementation. This came up on D109065 where we currently have to add yet another implementation of the same code. Differential Revision: https://reviews.llvm.org/D109683	2021-09-13 16:27:12 +01:00
vnalamot	0fc3ebb70a	[SelectionDAG][NFC] Fix typo in VerifyDAGDiverence() function name Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D109674	2021-09-13 20:48:04 +05:30
David Truby	915e9e76bf	[llvm][sve] Lowering for VLS masked extending loads This extends the custom lowering for extending loads on fixed length vectors in SVE to support masked extending loads. The existing tests for correct behaviour of masked extending loads exhibit bad code generation due to the legalisation of i1 vectors. They have been left as-is and new tests have been added that do not exhibit this behaviour. Differential Revision: https://reviews.llvm.org/D108200	2021-09-13 11:13:25 +01:00
Nikita Popov	4189e5fe12	[CGP] Support opaque pointers in address mode fold Rather than inspecting the pointer element type, use the access type of the load/store/atomicrmw/cmpxchg. In the process of doing this, simplify the logic by storing the address + type in MemoryUses, rather than an Instruction + Operand pair (which was then used to fetch the address).	2021-09-12 17:43:37 +02:00
Kazu Hirata	c9fca53af1	[CodeGen, Target] Use pred_empty and succ_empty (NFC)	2021-09-10 11:11:31 -07:00
Nikita Popov	14afbe9448	[CallLowering] Support opaque pointers Always use the byval/inalloca/preallocated type (which is required nowadays), don't fall back on the pointer element type. This requires adding Function::getParamPreallocatedType() to mirror the CallBase API, so that the templated code can work with both.	2021-09-10 18:32:12 +02:00
Sander de Smalen	ec7d8d5069	[SelectionDAG] PromoteIntRes_EXTRACT_SUBVECTOR for scalable vectors (widening). This patch implements legalization of EXTRACT_SUBVECTOR for the case where the result needs promoting, and the input type requires widening. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D109509	2021-09-10 13:29:26 +01:00
Sander de Smalen	801a745dd2	[SelectionDAG] PromoteIntRes_EXTRACT_SUBVECTOR for scalable vectors. This patch implements legalization of EXTRACT_SUBVECTOR for the case where the result needs promoting, and the input type is either legal or requires splitting. The idea is that the operation is broken down into simpler steps, by first extracting a smaller subvector until the input vector becomes legal or requires promotion. Reviewed By: CarolineConcatto Differential Revision: https://reviews.llvm.org/D109313	2021-09-10 13:29:26 +01:00
Zequan Wu	12f80c0bbd	[DebugInfo] Emit DW_AT_inline under -g1/-gmlt Differential Revision: https://reviews.llvm.org/D109554	2021-09-09 18:59:50 -07:00
Craig Topper	9af8f1b18e	[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode. Soft deprecate isNullValue/isAllOnesValue and update in tree callers. This matches the changes to the APInt interface from D109483. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D109535	2021-09-09 13:28:30 -07:00
Craig Topper	517728fe1e	[SelectionDAG] Use DAG.getNOT to further simplify some code. NFC Followup to D109483	2021-09-09 10:53:39 -07:00
Nick Desaulniers	e69d402088	[NFC] rename member of BitTestBlock and JumpTableHeader Follow up to suggestions in D109103 via hans: I think UnreachableDefault (or UnreachableFallthrough) would be a better name now, since it doesn't just omit the range check, it also omits the last bit test. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D109455	2021-09-09 10:43:00 -07:00
Chris Lattner	d51da74889	[CodeGen] Use DAG.getAllOnesConstant where possible to simplify code. NFC.	2021-09-09 10:22:51 -07:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Chris Lattner	9e46dd965a	[APInt.h] Reduce the APInt header file interface a bit. NFC This moves one mid-size function out of line, inlines the trivial tcAnd/tcOr/tcXor/tcComplement methods into their only caller, and moves the magic/umagic functions into SelectionDAG since they are implementation details of its algorithm. This also removes the unit tests for magic, but these are already tested in the divide lowering logic for various targets. This also upgrades some C style comments to C++. Differential Revision: https://reviews.llvm.org/D109476	2021-09-08 18:17:07 -07:00
Amara Emerson	eae44c8a86	[GlobalISel] Implement merging of stores of truncates. This is a port of a combine which matches a pattern where a wide type scalar value is stored by several narrow stores. It folds it into a single store or a BSWAP and a store if the targets supports it. Assuming little endian target: i8 p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; => ((i32)p) = val; On CTMark AArch64 -Os this results in a good amount of savings: Program before after diff SPASS 412792 412788 -0.0% kc 432528 432512 -0.0% lencod 430112 430096 -0.0% consumer-typeset 419156 419128 -0.0% bullet 475840 475752 -0.0% tramp3d-v4 367760 367628 -0.0% clamscan 383388 383204 -0.0% pairlocalalign 249764 249476 -0.1% 7zip-benchmark 570100 568860 -0.2% sqlite3 287628 286920 -0.2% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D109419	2021-09-08 17:06:33 -07:00
Nick Desaulniers	4331f19d8b	[ISEL][BitTestBlock] omit additional bit test when default destination is unreachable Otherwise we end up with an extra conditional jump, followed by an unconditional jump off the end of a function. ie. bb.0: BT32rr .. JCC_1 %bb.4 ... bb.1: BT32rr .. JCC_1 %bb.2 ... JMP_1 %bb.3 bb.2: ... bb.3.unreachable: bb.4: ... Should be equivalent to: bb.0: BT32rr .. JCC_1 %bb.4 ... JMP_1 %bb.2 bb.1: bb.2: ... bb.3.unreachable: bb.4: ... This can occur since at the higher level IR (Instruction) SwitchInsts are required to have BBs for default destinations, even when it can be deduced that such BBs are unreachable. For most programs, this isn't an issue, just wasted instructions since the unreachable has been statically proven. The x86_64 Linux kernel when built with CONFIG_LTO_CLANG_THIN=y fails to boot though once D106056 is re-applied. D106056 makes it more likely that correlation-propagation (CVP) can deduce that the default case of SwitchInsts are unreachable. The x86_64 kernel uses a binary post processor called objtool, which emits this warning: vmlinux.o: warning: objtool: cfg80211_edmg_chandef_valid()+0x169: can't find jump dest instruction at .text.cfg80211_edmg_chandef_valid+0x17b I haven't debugged precisely why this causes a failure at boot time, but fixing this very obvious jump off the end of the function fixes the warning and boot problem. Link: https://bugs.llvm.org/show_bug.cgi?id=50080 Fixes: https://github.com/ClangBuiltLinux/linux/issues/679 Fixes: https://github.com/ClangBuiltLinux/linux/issues/1440 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D109103	2021-09-08 11:03:47 -07:00
David Green	d8d24c64fe	[DAG] Fix GT -> GE condition when creating SetCC `79845ed6df` folded some setcc(ashr) conditions to setcc, but got the condition for NE incorrect, using GT where it should be using GE.	2021-09-08 12:41:51 +01:00
Evgeny Leviant	93b09a2a5d	[LiveDebugValues] Handle spills of indirect debug values correctly When handling register spill for indirect debug value LiveDebugValues pass doesn't add DW_OP_deref operator which may in some cases cause debugger to return value address, instead of value while machine register holding that address is spilled. Differential revision: https://reviews.llvm.org/D109142	2021-09-08 14:06:08 +03:00
Fraser Cormack	2c5568a6a9	[LegalizeTypes][VP] Add promotion support for binary VP ops This patch extends the preliminary support for vector-predicated (VP) operation legalization to include promotion of illegal integer vector types. Integer promotion of binary VP operations is relatively simple and piggy-backs on the non-VP logic, but passing the two extra mask and VP operands through to the promoted operation. Tests have been added to the RISC-V target to cover the basic scenarios for integer promotion for both fixed- and scalable-vector types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108288	2021-09-08 10:22:57 +01:00
Peter Smith	5e71839f77	[MC] Add MCSubtargetInfo to MCAlignFragment In preparation for passing the MCSubtargetInfo (STI) through to writeNops so that it can use the STI in operation at the time, we need to record the STI in operation when a MCAlignFragment may write nops as padding. The STI is currently unused, a further patch will pass it through to writeNops. There are many places that can create an MCAlignFragment, in most cases we can find out the STI in operation at the time. In a few places this isn't possible as we are in initialisation or finalisation, or are emitting constant pools. When possible I've tried to find the most appropriate existing fragment to obtain the STI from, when none is available use the per module STI. For constant pools we don't actually need to use EmitCodeAlign as the constant pools are data anyway so falling through into it via an executable NOP is no better than falling through into data padding. This is a prerequisite for D45962 which uses the STI to emit the appropriate NOP for the STI. Which can differ per fragment. Note that involves an interface change to InitSections. It is now called initSections and requires a SubtargetInfo as a parameter. Differential Revision: https://reviews.llvm.org/D45961	2021-09-07 15:46:19 +01:00
Mirko Brkusanin	6c4b634da6	[AMDGPU][GlobalISel] Legalize G_MUL for non-standard types Legalizing G_MUL for non-standard types (like i33) generated an error. Putting minScalar and maxScalar instead of clampScalar. Also using new rule, instead of widening to the next power of 2, widen to the next multiple of the passed argument (32 in this case), so instead of widening i65 to i128, we widen it to i96. Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D109228	2021-09-07 16:33:24 +02:00
Mirko Brkusanin	5263bf583a	[AMDGPU][GlobalISel] Legalization of G_ROTL and G_ROTR Add implementation for the legalization of G_ROTL and G_ROTR machine instructions. They are very similar to funnel shift instructions, the only difference is funnel shifts have 3 operands, whereas rotate instructions have two operands, the first being the register that is being rotated and the second being the number of shifts. The legalization of G_ROTL/G_ROTR is just lowering them into funnel shift instructions if they are legal. Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D105347	2021-09-07 16:33:24 +02:00
Mirko Brkusanin	36527cbe02	[AMDGPU][GlobalISel] Legalize memcpy family of intrinsics Legalize G_MEMCPY, G_MEMMOVE, G_MEMSET and G_MEMCPY_INLINE. Corresponding intrinsics are replaced by a loop that uses loads/stores in AMDGPULowerIntrinsics pass unless their length is a constant lower than MemIntrinsicExpandSizeThresholdOpt (default 1024). Any G_MEM* instruction that reaches legalizer should have a const length argument and should be expanded into appropriate number of loads + stores. Differential Revision: https://reviews.llvm.org/D108357	2021-09-07 12:24:07 +02:00
Fraser Cormack	a823bdf3ab	[RISCV][VP] Custom lower VP_STORE and VP_LOAD This patch adds support for the vector-predicated `VP_STORE` and `VP_LOAD` nodes. We do this in the same way we lower `MSTORE` and `MLOAD`: to regular load/store instructions via intrinsics. One necessary change was made to `SelectionDAGLegalize` so that `VP_STORE` nodes' operation actions are taken from the stored "value" operands, in the same vein as `STORE` or `MSTORE`. Reviewed By: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D108999	2021-09-07 10:53:25 +01:00
Fraser Cormack	f4dee8cb82	[RISCV][VP] Custom lower VP_SCATTER and VP_GATHER This patch adds support for the `VP_SCATTER` and `VP_GATHER` nodes by lowering them to RVV's `vsox`/`vlux` instructions, respectively. This process is almost identical to the existing `MSCATTER`/`MGATHER` support. One extra change was made to `SelectionDAGLegalize` so that `VP_SCATTER`'s operation action is derived from its stored "value" operand rather than its return type (which is always the chain). Reviewed By: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D108987	2021-09-07 10:43:07 +01:00
Sanjay Patel	e1e4bf174b	[DAGCombine] Prevent the transform of combine for multi-use operand The test is based on a miscompile example in: https://llvm.org/PR51321 Differential Revision: https://reviews.llvm.org/D107692	2021-09-06 15:30:32 -04:00
Jonas Paulsson	118997d8e9	[SelectionDAGBuilder] Bugfix in visitInlineAsm() In case of a virtual register tied to a phys-def, the register class needs to be computed. Make sure that this works generally also with fast regalloc by using TLI.getRegClassFor() whenever possible, and make only the case of 'Untyped' use getMinimalPhysRegClass(). Fixes https://bugs.llvm.org/show_bug.cgi?id=51699. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D109291	2021-09-06 17:46:31 +02:00
David Green	1b83aaaefa	[DAG] Remove oneuse check in select_cc setgt X, -1, C, ~C fold This appears to produce better code, even if the condition may need to be replicated.	2021-09-05 16:18:31 +01:00
David Green	8523fb96a6	[DAG] Fold select_cc setgt X, -1, C, ~C -> xor (ashr X, BW-1), C Given a select_cc producing a constant and an inversion of the constant for a comparison more than zero, we can produce an xor with ashr instead, which produces smaller code. The ashr either sets all bits or clears all bits depending on if the value is negative. This is then xor'd with the constant to optionally negate the value. https://alive2.llvm.org/ce/z/DTFaBZ This includes a OneUseCheck on the Cmp, which seems to make things a little worse and will be removed in a followup. Differential Revision: https://reviews.llvm.org/D109149	2021-09-05 16:04:01 +01:00
David Green	79845ed6df	[DAG] Fold setcc eq with ashr to compare to zero. Pulled out of D109149, this folds set_cc seteq (ashr X, BW-1), -1 -> set_cc setlt X, 0 to prevent some regressions later on when folding select_cc setgt X, -1, C, ~C -> xor (ashr X, BW-1), C Differential Revision: https://reviews.llvm.org/D109214	2021-09-05 14:06:47 +01:00
Fangrui Song	e03c8d309a	[AsmPrinter] Remove unneeded MCSubtargetInfo temporary after D14346. NFC The temporary object was used as a workaround when the target parser may change STI. D14346 made the MCSubtargetInfo argument to createMCAsmParser const, so we no longer need the temporary object.	2021-09-04 10:50:10 -07:00
Konstantin Schwarz	90d5298759	[GlobalISel] Add convenience constructors to MemDesc This allows constructing a MemDesc from a MachineMemoryOperand, a pattern that starts to show up more frequently. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D109161	2021-09-03 12:52:18 +02:00
Chen Zheng	34badc409c	Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount." This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and is not a right patch according to comments in D91724 This reverts commit `42eaf4fe0a`.	2021-09-03 02:55:43 +00:00
Jessica Paquette	844d8e0337	[GlobalISel] Combine icmp eq/ne x, 0/1 -> x when x == 0 or 1 This adds the following combines: ``` x = ... 0 or 1 c = icmp eq x, 1 -> c = x ``` and ``` x = ... 0 or 1 c = icmp ne x, 0 -> c = x ``` When the target's true value for the relevant types is 1. This showed up in the following situation: https://godbolt.org/z/M5jKexWTW SDAG currently supports the `ne` case, but not the `eq` case. This can probably be further generalized, but I don't feel like thinking that hard right now. This gives some minor code size improvements across the board on CTMark at -Os for AArch64. (0.1% for 7zip and pairlocalalign in particular.) Differential Revision: https://reviews.llvm.org/D109130	2021-09-02 15:05:31 -07:00
Heejin Ahn	28780e59f6	[WebAssembly] Add Wasm SjLj support This adds support for SjLj using Wasm exception handling instructions: https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md This does not yet support the mixed use of EH and SjLj within a function. It will be added in a follow-up CL. This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s, except for the below: - `test_longjmp_standalone`: Uses Node - `test_dlfcn_longjmp`: Uses NodeRAWFS - `test_longjmp_throw`: Mixes EH and SjLj - `test_exceptions_longjmp1`: Mixes EH and SjLj - `test_exceptions_longjmp2`: Mixes EH and SjLj - `test_exceptions_longjmp3`: Mixes EH and SjLj Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D108960	2021-09-02 10:51:02 -07:00
David Green	9cb8f4d1ad	[ARM] Add a tail-predication loop predicate register The semantics of tail predication loops means that the value of LR as an instruction is executed determines the predicate. In other words: mov r3, #3 DLSTP lr, r3 // Start tail predication, lr==3 VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0. mov lr, #1 VADD.s32 q0, q1, q2 // Only first lane is updated. This means that the value of lr cannot be spilled and re-used in tail predication regions without potentially altering the behaviour of the program. More lanes than required could be stored, for example, and in the case of a gather those lanes might not have been setup, leading to alignment exceptions. This patch adds a new lr predicate operand to MVE instructions in order to keep a reference to the lr that they use as a tail predicate. It will usually hold the zeroreg meaning not predicated, being set to the LR phi value in the MVETPAndVPTOptimisationsPass. This will prevent it from being spilled anywhere that it needs to be used. A lot of tests needed updating. Differential Revision: https://reviews.llvm.org/D107638	2021-09-02 13:42:58 +01:00
Roman Lebedev	3f1f08f0ed	Revert @llvm.isnan intrinsic patchset. Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.) TLDR: the original patch had no prior RFC, yet it had some changes that really need a proper RFC discussion. It won't be productive to discuss such an RFC, once it's actually posted, while said patch is already committed, because that introduces bias towards already-committed stuff, and the tree is potentially in broken state meanwhile. While the end result of discussion may lead back to the current design, it may also not lead to the current design. Therefore i take it upon myself to revert the tree back to last known good state. This reverts commit `4c4093e6e3`. This reverts commit `0a2b1ba33a`. This reverts commit `d9873711cb`. This reverts commit `791006fb8c`. This reverts commit `c22b64ef66`. This reverts commit `72ebcd3198`. This reverts commit `5fa6039a5f`. This reverts commit `9efda541bf`. This reverts commit `94d3ff09cf`.	2021-09-02 13:53:56 +03:00
Fraser Cormack	ef78f2106c	[LegalizeTypes][VP] Add splitting support for binary VP ops This patch extends D107904's introduction of vector-predicated (VP) operation legalization to include vector splitting. When the result of a binary VP operation needs splitting, all of its operands are split in kind. The two operands and the mask are split as usual, and the vector-length parameter EVL is "split" such that the low and high halves each execute the correct number of elements. Tests have been added to the RISC-V target to show splitting several scenarios for fixed- and scalable-vector types. Without support for `umax` (e.g. in the `B` extension) the generated code starts to branch. Ideally a cost model would prevent their insertion in the first place. Through these tests many opportunities for better codegen can be seen: combining known-undef VP operations and for constant-folding operations on `ISD::VSCALE`, to name but a few. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107957	2021-09-02 10:15:53 +01:00
Abinav Puthan Purayil	0baace5379	[DAGCombine] Add node level checks for fp-contract and fp-ninf in visitFMULForFMADistributiveCombine(). Differential Revision: https://reviews.llvm.org/D107551	2021-09-02 11:33:14 +05:30
Roman Lebedev	f5753125f0	[Codegen][TLI][X86] SimplifyMultipleUseDemandedBits(): 0'th vec subreg widening is free, try to perform it earlier I believe the profitability reasoning here is correct: the "sub"reg is already located within the 0'th subreg of the wider reg, so if we have a subvector insertion at index 0 into undef, then it's always free to do. After this, D109065 finally avoids the regression in D108382. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D109074	2021-09-02 00:54:05 +03:00
Arthur Eubanks	52e6d70c40	[NFC] Use newly introduced *AtIndex methods Introduced in D108788. These are clearer.	2021-09-01 11:18:41 -07:00
Fraser Cormack	85fd44d7fe	[SelectionDAG][NFC] Fix typo in assertion message s/Uexpected/Unexpected.	2021-09-01 08:55:06 +01:00
Yonghong Song	89424a829f	[DWARF] Support new TAG DW_TAG_LLVM_annotation A new LLVM specific TAG DW_TAG_LLVM_annotation is added. The name is suggested by Paul Robinson ([1]). Currently, this tag is used to output __attribute__((btf_tag("string"))) annotations in dwarf. The following is an example for a global variable with two btf_tag attributes: 0x0000002a: DW_TAG_variable DW_AT_name ("g1") DW_AT_type (0x00000052 "int") DW_AT_external (true) DW_AT_decl_file ("/tmp/home/yhs/work/tests/llvm/btf_tag/t.c") DW_AT_decl_line (8) DW_AT_location (DW_OP_addr 0x0) 0x0000003f: DW_TAG_LLVM_annotation DW_AT_name ("btf_tag") DW_AT_const_value ("tag1") 0x00000048: DW_TAG_LLVM_annotation DW_AT_name ("btf_tag") DW_AT_const_value ("tag2") 0x00000051: NULL In the future, DW_TAG_LLVM_annotation may encode other type of non-string const value. [1] https://lists.llvm.org/pipermail/llvm-dev/2021-June/151250.html Differential Revision: https://reviews.llvm.org/D106621	2021-08-31 19:22:17 -07:00
Stanislav Mekhanoshin	d170945bb2	[RegAlloc] Immediately delete dead instructions with live uses When RA eliminated a dead def it can either immediately delete the instruction itself or replace it with KILL to defer the actual removal. If this instruction has a virtual register use killing the register it will shrink the LI of the use. However, if the LI covers the instruction and extends beyond it the shrink will not happen. In fact it is impossible to shrink such a use because of the KILL still using it. If later the LI of the use will be split at the KILL and the KILL itself is eliminated after that point the new live segment ends up at an invalid slot index. This extremely rare condition was hit after D106408 which has enabled rematerialization of such instructions. The replacement with KILL is only done for rematerialized defs which became dead and such rematerialization did not generally happen before. The patch deletes an instruction immediately if it is a result of rematerialization and has such use. An alternative would be to prohibit a split at a KILL instruction, but it looks like it is better to split a live range rather than keeping a killed instruction just in case it can be rematerialized further. Fixes PR51655. Differential Revision: https://reviews.llvm.org/D108951	2021-08-31 13:46:00 -07:00
Jessica Paquette	94d3ff09cf	[GlobalISel] Don't use G_FPTOSI in G_ISNAN legalization As noted in the comments in D108227, using G_FPTOSI produces wrong results for G_ISNAN. Drop the G_FPTOSI and perform the operation on integer types. Elsewhere in LLVM, a bitcast would be the appropriate choice (as it is in SDAG). GlobalISel does not distinguish between integer and FP types, so a bitcast would be meaningless here.	2021-08-31 10:26:42 -07:00
Hussain Kadhem	524ded7d01	[VP] implementation of sdag support for VP memory intrinsics Followup to D99355: SDAG support for vector-predicated load/store/gather/scatter. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D105871	2021-08-31 17:01:50 +02:00
Nemanja Ivanovic	84d4ed1761	Revert "[DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item" This reverts commit `0a6fad754e`. It caused failures on a number of PowerPC bots.	2021-08-31 09:24:50 -05:00
Craig Topper	201f6446da	[LegalizeTypes][X86] Improve ExpandIntRes_FP_TO_SINT/ExpandIntRes_FP_TO_UINT when input is SoftPromoteHalf. Instead of splitting off the fp16 to float conversion and generating a libcall, we should split the operation into fp16 to float and float to integer operations. This will allow the float to integer conversion to go through any custom handling the target has. If the target doesn't have custom handling then we should come back to ExpandIntRes_FP_TO_SINT/ ExpandIntRes_FP_TO_UINT automatically to create the libcall. This avoids generating libcalls on 32-bit X86. These library functions may not exist in 32-bit libgcc. At least for LLVM, we never generate them when hardware floating point instructions are available. Differential Revision: https://reviews.llvm.org/D108933	2021-08-30 13:12:59 -07:00
Bjorn Pettersson	789f01283d	[SelectionDAG] Fix miscompile bugs related to smul.fix.sat with scale zero When expanding a SMULFIXSAT ISD node (usually originating from a smul.fix.sat intrinsic) we've applied some optimizations for the special case when the scale is zero. The idea has been that it would be cheaper to use an SMULO instruction (if legal) to perform the multiplication and at the same time detect any overflow. And in case of overflow we could use some SELECTs to replace the result with the saturated min/max value. The only tricky part is to know if we overflowed on the min or max value, i.e. if the product is positive or negative. Unfortunately the implementation has been incorrect, as it has looked at the product returned by the SMULO to determine the sign of the product. In case of overflow that product is truncated and won't give us the correct sign bit. This patch adds an extra XOR of the multiplication operands, which is used to determine the sign of the non-truncated product. This patch fixes PR51677. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D108938	2021-08-30 22:08:26 +02:00
Chih-Ping Chen	070090cfa5	[DebugInfo] Remove the restriction on the size of DIStringType in DebugHandlerBase::isUnsignedDIType. Differential Revision: https://reviews.llvm.org/D108559	2021-08-30 15:36:54 -04:00
Nikita Popov	0529e2e018	[InstrInfo] Use 64-bit immediates for analyzeCompare() (NFCI) The backend generally uses 64-bit immediates (e.g. what MachineOperand::getImm() returns), so use that for analyzeCompare() and optimizeCompareInst() as well. This avoids truncation for targets that support immediates larger than 32 bits. In particular, we can avoid the bugprone value normalization hack in the AArch64 target. This is a followup to D108076. Differential Revision: https://reviews.llvm.org/D108875	2021-08-30 19:46:04 +02:00
Hongtao Yu	f39256e3a5	[CSSPGO] Avoid repeatedly computing md5 hash code for pseudo probe inline contexts. MD5 hashing is expensive. Use a hash map to look up already computed GUIDs for DWARF names. Saw a 2% build time improvement on an internal large application. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D108722	2021-08-30 10:11:47 -07:00
Kazu Hirata	c50faffb4e	[llvm] Remove redundant calls to str() and c_str() (NFC) Identified with readability-redundant-string-cstr.	2021-08-30 09:05:05 -07:00
Craig Topper	705d005781	[DAGCombiner][RISCV] Don't use vector types in DAGCombiner::tryStoreMergeOfLoads if we need a rotate. The check for whether a rotate is possible occurs before the memory legality checks for the integer type. So it's possible we decide we can use a rotate, but then fail the legality checks. If that happens we should not fall back to a vector type. This triggers an assertion in the rotate handling when it finds a vector type instead of an integer type. In theory we could use a shufflevector in place of the rotate, but right now I'd just like to fix the crash. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D108839	2021-08-30 08:47:15 -07:00
Djordje Todorovic	86f5288eae	[LiveDebugValues] Cleanup Transfers when removing Entry Value If we encounter a new debug value describing the same parameter, we should stop tracking the parameter's Entry Value. At that point, in some cases, the Transfer which uses the parameter's Entry Value is already emitted. Thanks to the RemoveRedundantDebugValues pass, many problems with incorrect instruction order and number of DBG_VALUEs are fixed. However, we still cannot rely on the rule that each new debug value is set by the previous non-debug instruction in the Machine Basic Block. When a new parameter debug value triggers removal of the Backup Entry Value for the same parameter, do the cleanup of Transfers emitted from Backup Entry Values. Get the Transfer Instruction which created the new debug value and search for debug values already emitted from the to-be-deleted Backup Entry Value and attached to the Transfer Instruction. If found, delete the Transfer and remove the "primary" Entry Value Var Loc from OpenRanges. This patch fixes PR47628. Patch by Nikola Tesic. Differential revision: https://reviews.llvm.org/D106856	2021-08-30 14:00:41 +02:00
Simon Pilgrim	7c25a32840	Fix MSVC "signed/unsigned mismatch" comparison warning. NFCI.	2021-08-30 12:11:09 +01:00
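
Note on 789f01283d (smul.fix.sat with scale zero): the fix described in that commit message can be illustrated with a small standalone sketch. This is not the SelectionDAG code from the patch. The function name `smulSat32` is hypothetical, and a widened 64-bit multiply stands in for SMULO as the overflow check. The point it shows is the one the message makes: once the product has overflowed, its truncated low bits no longer carry the right sign, so the saturation direction is taken from the XOR of the operands instead.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration only (not LLVM code): a saturating
// signed 32-bit multiply, i.e. the scale-zero case of smul.fix.sat.
int32_t smulSat32(int32_t a, int32_t b) {
  // Assumption: a 64-bit multiply is used here to detect overflow. The
  // commit itself talks about SMULO, which reports overflow directly.
  const int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  const bool overflow = wide > std::numeric_limits<int32_t>::max() ||
                        wide < std::numeric_limits<int32_t>::min();
  if (!overflow)
    return static_cast<int32_t>(wide);

  // On overflow the low 32 bits of the product are unreliable as a sign
  // indicator. For example 65536 * 32768 = 2^31 truncates to a value with
  // the sign bit set even though the true product is positive.
  // The true product is negative exactly when the operand signs differ,
  // which is the sign bit of (a ^ b). Neither operand can be zero here,
  // because a zero operand never overflows.
  const bool negative = (a ^ b) < 0;
  return negative ? std::numeric_limits<int32_t>::min()
                  : std::numeric_limits<int32_t>::max();
}
```

With this sketch, `smulSat32(65536, 32768)` saturates to the maximum value and `smulSat32(-65536, 65536)` saturates to the minimum value, whereas inspecting the truncated product would pick the opposite bound in both cases.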

31452 Commits