llvm-project

Commit Graph

Author	SHA1	Message	Date
Jeremy Morse	4501928eb2	Re-land `ae4303b42c`, "Track PHI values through register coalescing" This was reverted in `0507fc2ffc`: in phi-coalesce-subreg.mir I'd explicitly named some passes to run instead of specifying a range. As a result some two-address-instrs weren't correctly rewritten and the verifier got upset. Original commit message: [DebugInstrRef][2/3] Track PHI values through register coalescing In the instruction referencing variable location model, we store variable locations that point at PHIs in MachineFunction during register allocation. Unfortunately, register coalescing can substantially change the locations of registers, and so that PHI-variable-location side table needs maintenance during the pass. This patch builds an index from the side table, and whenever a vreg gets coalesced into another vreg, updates the index to record the new vreg that the PHI happens in. It also accepts a limited range of subregister coalescing, for example merging a subregister into a larger class. Differential Revision: https://reviews.llvm.org/D86813	2021-06-04 11:32:02 +01:00
Fraser Cormack	aec9cbbeb8	[SelectionDAG] Extend FoldConstantVectorArithmetic to SPLAT_VECTOR This patch extends the SelectionDAG's ability to constant-fold vector arithmetic to include support for SPLAT_VECTOR. This is not only for scalable-vector types but also for fixed-length vector types, which helps Hexagon in a couple of cases. The original RISC-V test case was in fact an infinite DAGCombine loop. The pattern `and (truncate v1), (truncate v2)` can be combined to `truncate (and v1, v2)` but the truncate can similarly be combined back to `and (truncate v1), (truncate v2)` (but, crucially, only when one of `v1` or `v2` is a constant vector). It wasn't exposed on fixed-length types because a TRUNCATE of a constant BUILD_VECTOR was folded into the BUILD_VECTOR itself, whereas this did not happen for the equivalent (scalable-vector) SPLAT_VECTOR. Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D103246	2021-06-04 09:53:15 +01:00
Esme-Yi	fbfd717197	[Debug-Info] handle DW_CC_pass_by_value/DW_CC_pass_by_reference under strict DWARF. Summary: When -strict-dwarf=true is specified, the calling convention info DW_CC_pass_by_value or DW_CC_pass_by_reference can only be generated at DWARF5. Reviewed By: shchenz, dblaikie Differential Revision: https://reviews.llvm.org/D103300	2021-06-04 08:14:47 +00:00
Arthur Eubanks	9255a5c1ba	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. Issues can be diagnosed with D103412. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-06-03 15:52:01 -07:00
Brendon Cahoon	53ab2d821e	[GlobalISel] Add G_SBFX/G_UBFX to computeKnownBits Differential Revision: https://reviews.llvm.org/D102969	2021-06-03 16:01:47 -04:00
Eli Friedman	44cdf771fe	[AtomicExpand] Merge cmpxchg success and failure ordering when appropriate. If we're not emitting separate fences for the success/failure cases, we need to pass the merged ordering to the target so it can emit the correct instructions. For the PowerPC testcase, we end up with extra fences, but that seems like an improvement over missing fences. If someone wants to improve that, the PowerPC backend could be taught to emit the fences after isel, instead of depending on fences emitted by AtomicExpand. Fixes https://bugs.llvm.org/show_bug.cgi?id=33332 . Differential Revision: https://reviews.llvm.org/D103342	2021-06-03 11:34:35 -07:00
Nikita Popov	983565a6fe	[ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) This is a followup to D103422. The DenseMapInfo implementations for ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h headers, which means that these two headers no longer need to be included by DenseMapInfo.h. This required adding a few additional includes, as many files were relying on various things pulled in by ArrayRef.h. Differential Revision: https://reviews.llvm.org/D103491	2021-06-03 18:34:36 +02:00
Jeremy Morse	0507fc2ffc	Revert "[DebugInstrRef][2/3] Track PHI values through register coalescing" This reverts commit `ae4303b42c`. Expensive checks buildbot has found a problem with this: https://lab.llvm.org/buildbot/#/builders/16/builds/11863	2021-06-03 17:16:58 +01:00
Jeremy Morse	ae4303b42c	[DebugInstrRef][2/3] Track PHI values through register coalescing In the instruction referencing variable location model, we store variable locations that point at PHIs in MachineFunction during register allocation. Unfortunately, register coalescing can substantially change the locations of registers, and so that PHI-variable-location side table needs maintenance during the pass. This patch builds an index from the side table, and whenever a vreg gets coalesced into another vreg, updates the index to record the new vreg that the PHI happens in. It also accepts a limited range of subregister coalescing, for example merging a subregister into a larger class. Differential Revision: https://reviews.llvm.org/D86813	2021-06-03 17:06:51 +01:00
Fraser Cormack	1de1887f5f	[CodeGen] Fix a scalable-vector crash in VSELECT legalization The `DAGTypeLegalizer::WidenVSELECTMask` function is not (yet) ready for scalable vector types, and has numerous places in which it tries to grab either the fixed size or number of elements of its types. I believe that it should be possible to update this method to properly account for scalable-vector types, but we don't have test cases for that; RISC-V bails out early on as it has legal i1 vector masks. As such, this patch just prevents it from crashing. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103536	2021-06-03 10:24:55 +01:00
Fraser Cormack	2dd20a31f2	[ValueTypes] Fix scalable-vector changeExtendedVectorTypeToInteger The attached tests check for the regression in DAGCombiner's `visitVSELECT`, which may call this method. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103534	2021-06-03 09:36:56 +01:00
Sanjay Patel	0718ac706d	[SDAG] allow cast folding for vector sext-of-setcc with signed compare This extends `434c8e013a` and `ede3982792` to handle signed predicates by sign-extending the setcc operands. This is not shown directly in https://llvm.org/PR50055 , but the pattern is visible by changing the unsigned convert to signed in the source code.	2021-06-02 15:05:02 -04:00
Rong Xu	6745ffe4fa	[SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is mainly for ProfileData part of change. It will load FS Profile when such profile is detected. For an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. For other format profiles, the users need to use an internal option (-profile-isfs) to tell the compiler that the profile uses FS discriminators. This patch also simplified the bit API used by FS discriminators. Differential Revision: https://reviews.llvm.org/D103041	2021-06-02 10:32:52 -07:00
Sanjay Patel	ede3982792	[SDAG] allow more cast folding for vector sext-of-setcc This is a follow-up to D103280 that eases the use restrictions, so we can handle the motivating case from: https://llvm.org/PR50055 The loop code is adapted from similar use checks in ExtendUsesToFormExtLoad() and SliceUpLoad(). I did not see an easier way to filter out non-chain uses of load values. Differential Revision: https://reviews.llvm.org/D103462	2021-06-02 13:14:49 -04:00
Bjorn Pettersson	536e02a23c	[CodeGen] Refactor libcall lookups for RTLIB::POWI_* Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_* libcall for a given value type. This includes a small refactoring of ExpandFPLibCall and ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code, plus adding an ExpandFPLibCall version that can be called directly when expanding FPOWI/STRICT_FPOWI to ensure that we actually use the same RTLIB::Libcall when expanding the libcall as we used when checking the legality of such a call by doing a getLibcallName check. Differential Revision: https://reviews.llvm.org/D103050	2021-06-02 11:40:34 +02:00
Bjorn Pettersson	d1273d39d3	[LegalizeTypes] Avoid promotion of exponent in FPOWI The FPOWI DAG node is normally lowered to a libcall to one of the RTLIB::POWI* runtime functions and the exponent should normally have a type matching sizeof(int) when making the call. Thus, type promotion of the exponent could lead to an FPOWI with a type for the second operand that would be incorrect when doing the libcall (a situation which would be hard to detect post-legalization if we allow such FPOWI nodes). This patch is changing DAGTypeLegalizer::PromoteIntOp_FPOWI to do the rewrite into a libcall directly instead of promoting the operand. This way we can check that the exponent is smaller than sizeof(int) and we can let TargetLowering handle promotion as part of making the libcall. It could be noticed here that makeLibCall has some knowledge about targets such as 64-bit RISCV, for which the libcall argument should be extended to a type larger than sizeof(int). Differential Revision: https://reviews.llvm.org/D102950	2021-06-02 11:40:34 +02:00
Sriraman Tallam	516e5bb2b1	Resubmit D85085 after fixing the tests that were failing. D85085 was pushed earlier but broke tests on mac and win: http://lab.llvm.org:8080/green/job/clang-stage1-RA/21182/consoleFull#-706149783d489585b-5106-414a-ac11-3ff90657619c Recommitting it after adding mtriple to the llc commands. Emit correct location lists with basic block sections. This patch addresses multiple things: 1) It ensures that const_value is emitted when possible with basic block sections. 2) It emits location lists such that the labels are always within the section boundary. 3) It fixes a bug when the parameter is first used in a non-entry block which is in a different section from the entry block. Differential Revision: https://reviews.llvm.org/D85085	2021-06-01 21:59:47 -07:00
Daniel Sanders	9372662050	fixup: Missing operator in [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one My local compiler was fine with it but the bots complain about ambiguous types.	2021-06-01 13:58:03 -07:00
Daniel Sanders	aaac268285	[globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one It's still in use in a few places so we can't delete it yet but there's not many at this point. Differential Revision: https://reviews.llvm.org/D103352	2021-06-01 13:23:48 -07:00
Jessica Paquette	e7f501b5e7	[GlobalISel][AArch64] Combine and (lshr x, cst), mask -> ubfx x, cst, width Also add a target hook which allows us to get around custom legalization on AArch64. Differential Revision: https://reviews.llvm.org/D99283	2021-06-01 10:56:17 -07:00
Guozhi Wei	1b748faf2b	[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 A similar optimization can also be applied to the LEA/ADD sequence. The modifications to TwoAddressInstructionPass ensure that the operands of the ADD instruction have the expected order (the dest register of LEA should be the src register of ADD). Differential Revision: https://reviews.llvm.org/D101970	2021-06-01 10:31:30 -07:00
Sanjay Patel	1b14f3951a	[SDAG] add helper function for sext-of-setcc folds; NFC Try to make this easier to read as noted in D103280	2021-06-01 08:07:17 -04:00
Arthur Eubanks	372237487e	[OpaquePtr] Remove some uses of PointerType::getElementType()	2021-05-31 16:11:25 -07:00
Arthur Eubanks	2c3afa3237	[OpaquePtr] Clean up some uses of Type::getPointerElementType() These depend on pointee types.	2021-05-31 09:54:57 -07:00
Arthur Eubanks	8815ce03e8	Remove "Rewrite Symbols" from codegen pipeline It breaks up the function pass manager in the codegen pipeline. With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely not in use. Add a check that we only have one function pass manager in the codegen pipeline. Some tests relied on the fact that we had a module pass somewhere in the codegen pipeline. addr-label.ll crashes on ARM due to this change. This is because a ARMConstantPoolConstant containing a BasicBlock to represent a blockaddress may hold an invalid pointer to a BasicBlock if the blockaddress is invalidated by its BasicBlock getting removed. In that case all referencing blockaddresses are RAUW a constant int. Making ARMConstantPoolConstant::CVal a WeakVH fixes the crash, but I'm not sure that's the right fix. As a workaround, create a barrier right before ISel so that IR optimizations can't happen while a ARMConstantPoolConstant has been created. Reviewed By: rnk, MaskRay, compnerd Differential Revision: https://reviews.llvm.org/D99707	2021-05-31 08:32:36 -07:00
Sanjay Patel	63fe4cb082	[SDAG] add check to sext-of-setcc fold to bypass changing a legal op I accidentally pushed a draft of D103280 that was discussed during the review, but it was not supposed to be the final version. Rather than revert and recommit, I'm updating the existing code. This way we have a record of the codegen diff that would result if we decide to remove this predicate in the future.	2021-05-31 08:58:11 -04:00
Sanjay Patel	434c8e013a	[SDAG] try harder to fold casts into vector compare sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) -- (when the zexts are free and a bunch of other conditions) We have a couple of similar folds to this already for vector selects, but this pattern slips through because it is only a setcc. The tests are based on the motivating case from: https://llvm.org/PR50055 ...but we need extra logic to get that example, so I've left that as a TODO for now. Differential Revision: https://reviews.llvm.org/D103280	2021-05-31 07:14:01 -04:00
Djordje Todorovic	dee85d47d9	[LiveDebugVariables] Stop trimming locations of non-inlined vars D35953, D62650 and D73691 introduced trimming of variable locations in the LiveDebugVariables pass, since there are some cases where, after virtregrewrite, an explosive number of DBG_VALUEs is created for some inlined variables. All problematic cases appear to involve inlined variables, so it seems reasonable to stop trimming the location ranges for non-inlined variables. This has a very good impact on the llvm-locstats report. Differential Revision: https://reviews.llvm.org/D102917	2021-05-31 02:59:19 -07:00
Florian Hahn	126f90b252	[DAGCombine] Poison-prove scalarizeExtractedVectorLoad. extractelement is poison if the index is out-of-bounds, so just scalarizing the load may introduce an out-of-bounds load, which is UB. To avoid introducing new UB, we can mask the index so it only contains valid indices. Fixes PR50382. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103077	2021-05-30 11:40:55 +01:00
Pengxuan Zheng	056733d019	[SafeStack] Use proper API to get stack guard Using the proper API automatically sets `__stack_chk_guard` to `dso_local` if `Reloc::Static`. This wasn't strictly necessary until recently when dso_local was no longer implied by `TargetMachine::shouldAssumeDSOLocal` for `__stack_chk_guard`. By using the proper API, we can avoid generating unnecessary GOT relocations. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D102646	2021-05-30 00:52:48 -07:00
Arthur Eubanks	71cca4f728	Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry" This reverts commit `1c7f32334d`. Some code still needs to properly set parameter ABI attributes, see D101806.	2021-05-29 23:08:15 -07:00
Arthur Eubanks	3a6f12f915	Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering" This reverts commit `bc7d15c61d`. Dependent change is to be reverted.	2021-05-29 22:40:33 -07:00
LemonBoy	b577ec4956	[AtomicExpandPass][AArch64] Promote xchg with floating-point types to integer ones Follow the same strategy used for atomic loads/stores by converting the operands to equally-sized integer types. This change prevents the atomic expansion pass from generating illegal LL/SC pairs when targeting AArch64: `expand-atomicrmw-xchg-fp.ll` would previously instantiate intrinsics such as `llvm.aarch64.ldaxr.p0f32` that cannot be lowered. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103232	2021-05-29 08:57:27 +02:00
Eli Friedman	0b3b0a727a	[AArch64][RISCV] Make sure isel correctly honors failure orderings. If a cmpxchg specifies acquire or seq_cst on failure, make sure we generate code consistent with that ordering even if the success ordering is not acquire/seq_cst. At one point, it was ambiguous whether this sort of construct was valid, but the C++ standard and LLVM now accept arbitrary combinations of success/failure orderings. This doesn't address the corresponding issue in AtomicExpand. (This was reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .) Fixes https://bugs.llvm.org/show_bug.cgi?id=50512. Differential Revision: https://reviews.llvm.org/D103284	2021-05-28 12:47:40 -07:00
Craig Topper	2830d924b0	[VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names. Parameter positions seem like they should be unsigned. While there, make function names lowercase per coding standards. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103224	2021-05-28 11:28:47 -07:00
Craig Topper	d24d2447cd	[SelectionDAG] Fix typo in assert. NFC	2021-05-28 10:37:11 -07:00
Tim Northover	9ff2eb1ea5	SwiftTailCC: teach verifier musttail rules applicable to this CC. SwiftTailCC has a different set of requirements than the C calling convention for a tail call. The exact argument sequence doesn't have to match, but fewer ABI-affecting attributes are allowed. Also make sure the musttail diagnostic triggers if a musttail call isn't actually a tail call.	2021-05-28 11:12:00 +01:00
Amara Emerson	59a4ee9728	[AArch64][GlobalISel] Legalize oversize G_EXTRACT_VECTOR_ELT sources. Also changes the fewerElements helper to use the lookthrough constant helper instead of m_ICst, since m_ICst doesn't look through extends. Differential Revision: https://reviews.llvm.org/D103227	2021-05-27 23:52:24 -07:00
Matt Arsenault	e892705d74	GlobalISel: Do not change register types in lowerLoad Adjusting the load register type is a widenScalar type action, not a lowering. lowerLoad should be reserved for operations that change the memory access size, such as unaligned load decomposition. With this trying to adjust the register type, it was hard to avoid infinite loops in the legalizer. Adds a bandaid to avoid regressing a few AArch64 tests, but I'm not sure what the exact condition is and there's probably a cleaner way to do this. For AMDGPU this regresses handling of some cases for unaligned loads, but the way this is currently working is a pretty ugly hack.	2021-05-27 11:49:37 -04:00
Nico Weber	192b4141f0	Revert "Emit correct location lists with basic block sections." Breaks check-llvm on non-linux, see comments on https://reviews.llvm.org/D85085 This reverts commit `caae570978` and follow-up commit `1546c52d97`.	2021-05-27 11:42:04 -04:00
Matt Arsenault	808dc6f866	VirtRegMap: Preserve LiveDebugVariables This avoids recomputing it between regalloc runs when allocation is split, and also avoids a debug info test regression.	2021-05-27 10:40:14 -04:00
Fraser Cormack	5a80dc4988	[VP][SelectionDAG] Add a target-configurable EVL operand type This patch adds a way for the target to configure the type it uses for the explicit vector length operands of VP SDNodes. The type must be a legal integer type (there is still no target-independent legalization of this operand) and must currently be at least as big as i32, the type used by the IR intrinsics. An implicit zero-extension takes place on targets which choose a larger type. All VP nodes should be created with this type used for the EVL operand. This allows 64-bit RISC-V to avoid custom legalization of all VP nodes, keeping them in their target-independent form for that bit longer. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103027	2021-05-27 15:27:36 +01:00
Fraser Cormack	b7101e218c	[DAGCombine][RISCV] Don't try to trunc-store combined vector stores DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32` which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the function's callers to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173	2021-05-27 14:16:32 +01:00
Fraser Cormack	772b58a641	[SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs This patch extends the cases in which the legalizer is able to express VSELECT in terms of XOR/AND/OR. When dealing with a VSELECT between boolean vector types, the mask itself is an all-ones or all-zeros value of the operand type, so a 0/1 boolean type behaves identically to a 0/-1 type. This greatly helps RISC-V which relies on expansion for these nodes. It also allows scalable-vector bool VSELECTs to use the default expansion, where before it would crash in SelectionDAG::UnrollVectorOp. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103147	2021-05-27 10:08:57 +01:00
Amara Emerson	9f39ba13b5	[GlobalISel] Implement splitting of G_SHUFFLE_VECTOR. This is a port from the DAG legalization. We're still missing some of the canonicalizations of shuffles but it's a start. Differential Revision: https://reviews.llvm.org/D102828	2021-05-27 00:28:38 -07:00
Jessica Paquette	324af79dbc	[GlobalISel] Don't emit lost debug location remarks when legalizing tail calls There were a bunch of lost debug location remarks that show up when legalizing tail calls on AArch64. This would happen because we drop the return in the block where we emit the tail call. So, we end up dropping the debug location, which makes the LostDebugLocObserver report a missing debug location. Although it's true that we lose these debug locations, this isn't a particularly useful remark. We expect to drop these debug locations when emitting tail calls. Suppressing remarks in this case is preferable, since the amount of noise could hide actual debug location related bugs. To do this, I just plumbed the LostDebugLocObserver through the relevant LegalizerHelper functions. This is the only case I can think of where we need the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it in the LegalizerHelper proper and mucking around with the constructors, I figured it'd be cleanest to take the simplest path for now. This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at -Os. Differential Revision: https://reviews.llvm.org/D103128	2021-05-26 17:16:11 -07:00
Sriraman Tallam	caae570978	Emit correct location lists with basic block sections. This patch addresses multiple things: 1) It ensures that const_value is emitted when possible with basic block sections. 2) It emits location lists such that the labels are always within the section boundary. 3) It fixes a bug when the parameter is first used in a non-entry block which is in a different section from the entry block. Differential Revision: https://reviews.llvm.org/D85085	2021-05-26 17:12:31 -07:00
Jeremy Morse	8496fc2ec8	[DebugInstrRef][1/3] Track PHI values through register allocation This patch introduces "DBG_PHI" instructions, a marker of where a PHI instruction used to be, before PHI elimination. Under the instruction referencing model, we want to know where every value in the function is defined -- and a PHI, even if implicit, is such a place. Just like instruction numbers, we can use this to identify a value to be used as a variable value, but we don't need to know what instruction defines that value, for example: bb1: DBG_PHI $rax, 1 [... more insts ... ] bb2: DBG_INSTR_REF 1, 0, !1234, !DIExpression() This specifies that on entry to bb1, whatever value is in $rax is known as value number one -- and the later DBG_INSTR_REF marks the position where variable !1234 should take on value number one. PHI locations are stored in MachineFunction for the duration of the regalloc phase in the DebugPHIPositions map. The map is populated by PHIElimination, and then flushed back into the instruction stream by virtregrewriter. A small amount of maintenance is needed in LiveDebugVariables to account for registers being split, but only for individual positions, not for entire ranges of blocks. Differential Revision: https://reviews.llvm.org/D86812	2021-05-26 20:24:00 +01:00
Heejin Ahn	5dd86aadf0	[WebAssembly] Add TargetInstrInfo::getCalleeOperand DwarfDebug unconditionally assumes for all call instructions the 0th operand is the callee operand, which seems to be true for other targets, but not for WebAssembly. This adds a `TargetInstrInfo::getCalleeOperand` method whose default implementation returns `getOperand(0)` and makes WebAssembly override it to use its own utility method to get the callee operand. This also fixes an existing bug in `WebAssembly::getCalleeOp`, which was uncovered by this CL. Reviewed By: dschuff, djtodoro Differential Revision: https://reviews.llvm.org/D102978	2021-05-26 11:43:59 -07:00
Jonas Paulsson	d058262b14	[SystemZ] Support i128 inline asm operands. Support virtual, physical and tied i128 register operands in inline assembly. i128 is not really supported on SystemZ and is not a legal type; generally such a value will be split into two i64 parts. There are however some instructions that require a pair of two GPR64 registers contained in the GR128 bit reg class, which is untyped. For inline assembly operands, it proved to be very cumbersome to first follow the general behavior of splitting an i128 operand into two parts and then later rebuild the INLINEASM MI to have one GR128 register. Instead, some minor common code changes were made to SelectionDAGBuilder to only create one GR128 register part to begin with. In particular: - getNumRegisters() now has an optional parameter "RegisterVT" which is passed by AddInlineAsmOperands() and GetRegistersForValue(). - The bitcasting in GetRegistersForValue is not performed if RegVT is Untyped. - The RC for a tied use in AddInlineAsmOperands() is now computed either from the tied def (virtual register), or by getMinimalPhysRegClass() (physical register). - InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register class (DstRC) can also be computed for an illegal type. In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and joinRegisterPartsIntoValue() have been implemented to handle i128 operands. Differential Revision: https://reviews.llvm.org/D100788 Review: Ulrich Weigand	2021-05-26 10:08:32 -05:00
Tomas Matheson	e79e8041c5	[MC][NFCI] Factor out ELF section unique ID calculation Precursor to D100944. The logic for determining the unique ID had become quite difficult to reason about, so I have factored this out into a separate function. Differential Revision: https://reviews.llvm.org/D102336	2021-05-26 11:51:29 +01:00
Michael Liao	c9dd29925f	[SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics. - When memory intrinsics, such as memcpy, are lowered, the attached scoped AA metadata is not passed down to the backend. As a result, the backend cannot schedule relevant memory operations around them following that hint. In this patch, SelectionDAG is enhanced to propagate that metadata (scoped AA only) when they are lowered into loads and stores. Differential Revision: https://reviews.llvm.org/D102215	2021-05-25 14:42:26 -04:00
Benjamin Kramer	6359842bc0	[GlobalISel] Silence unused variable warning in Release builds. NFC.	2021-05-25 10:55:29 +02:00
Amara Emerson	ff30436dc5	[GlobalISel] Fix MachineIRBuilder not using the DstOp argument for G_SHUFFLE_VECTOR.	2021-05-25 00:43:26 -07:00
Christudasan Devadasan	90d784053f	AMDGPU/GlobalISel: Legalize G_[SU]DIVREM instructions Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D100726	2021-05-25 10:51:07 +05:30
Jon Roelofs	095e91c973	[Remarks] Add analysis remarks for memset/memcpy/memmove lengths Re-landing now that the crasher this patch previously uncovered has been fixed in: https://reviews.llvm.org/D102935 Differential revision: https://reviews.llvm.org/D102452	2021-05-24 10:10:44 -07:00
David Green	543406a69b	[ARM] Allow findLoopPreheader to return headers with multiple loop successors The findLoopPreheader function will currently not find a preheader if it branches to multiple different loop headers. This patch adds an option to relax that, allowing ARMLowOverheadLoops to process more loops successfully. This helps with WhileLoopStart setup instructions that can branch/fallthrough to the low overhead loop and to branch to a separate loop from the same preheader (but I don't believe it is possible for both loops to be low overhead loops). Differential Revision: https://reviews.llvm.org/D102747	2021-05-24 12:22:15 +01:00
Philipp Krones	c2f819af73	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo This makes it possible for targets to define their own MCObjectFileInfo. This MCObjectFileInfo is then used to determine things like section alignment. This is a follow up to D101462 and prepares for the RISCV backend defining the text section alignment depending on the enabled extensions. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101921	2021-05-23 14:15:23 -07:00
LemonBoy	fd5cc41818	[SelectionDAG] Fix argument copy elision with irregular types D29668 made it possible to avoid a useless copy of the argument value into an alloca, when the caller places it in memory (as often happens on x86), by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced, the upper bits are filled with garbage, producing code that misbehaves at runtime. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D102153	2021-05-22 09:43:37 +02:00
Nick Desaulniers	033138ea45	[IR] make stack-protector-guard-* flags into module attrs D88631 added initial support for: - -mstack-protector-guard= - -mstack-protector-guard-reg= - -mstack-protector-guard-offset= flags, and D100919 extended these to AArch64. Unfortunately, these flags aren't retained for LTO. Make them module attributes rather than TargetOptions. Link: https://github.com/ClangBuiltLinux/linux/issues/1378 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D102742	2021-05-21 15:53:30 -07:00
Matt Arsenault	7521fcd269	AMDGPU/GlobalISel: Add subtarget to a test SelectionDAG forces us to have a weird ABI for 16-bit values without legal 16-bit operations, but currently GlobalISel bypasses this and sometimes ends up using the gfx8+ ABI in some contexts. Make sure we're testing the normal ABI to avoid a test change in a future patch.	2021-05-21 23:57:38 +09:00
Stephen Tozer	36ec97f76a	3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reapplies `c0f3dfb9`, which was reverted following the discovery of crashes on linux kernel and chromium builds - these issues have since been fixed, allowing this patch to re-land. This reverts commit `4397b7095d`.	2021-05-21 11:06:20 +01:00
Christudasan Devadasan	ab60e361c2	GlobalISel: Help reduce operation width for instruction with two results. The function `reduceOperationWidth` helps to legalize a vector operation either by narrowing its type or by scalarizing the operation itself. It currently supports instructions with one result. This patch, in addition allows the same for instructions with two results (for instance, G_SDIVREM). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D100725	2021-05-21 10:34:18 +05:30
Serge Pavlov	c162f086ba	[APFloat] convertToDouble/Float can work on shorter types Previously APFloat::convertToDouble could be called only for APFloats that were built using double semantics. Other semantics like single precision were not allowed, although the corresponding numbers could be converted to double without loss of precision. A similar restriction applied to APFloat::convertToFloat. With this change any APFloat that can be precisely represented by double can be handled with convertToDouble. Behavior of convertToFloat was updated similarly. It makes the conversion operations more convenient and adds support for formats like half and bfloat. Differential Revision: https://reviews.llvm.org/D102671	2021-05-21 11:02:51 +07:00
Jessica Clarke	e10958c807	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits, where zero-extending atomics were incorrectly returning 0 rather than the (slightly confusing) required return value of 1. Re-landed again after D102819 fixed PowerPC to correctly zero-extend all of its atomics as it claimed to do, since the combination of that bug and this optimisation caused buildbot regressions. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-20 20:34:23 +01:00
Jon Roelofs	0af3105b64	Revert "[Remarks] Add analysis remarks for memset/memcpy/memmove lengths" This reverts commit `4bf69fb52b`. This broke spec2k6/403.gcc under -global-isel. Details to follow once I've reduced the problem.	2021-05-20 12:19:16 -07:00
Fraser Cormack	26bd2250c1	[RISCV] Ensure shuffle splat operands are type-legal The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a type-legal splat value as it may implicitly extract a vector element from another shuffle. It is not permitted to introduce an illegal type when lowering shuffles. This patch addresses the crash by adding a boolean flag to `getSplatValue`, defaulting to false, which when set will ensure a type-legal return value. If it is unable to do that it will fail to return a splat value. I've been through the existing uses of `getSplatValue` in other targets and was unable to find a need or test cases showing a need to update their uses. In some cases, the call is made during `LegalizeVectorOps` which may still produce illegal scalar types. In other situations, the illegally-typed splat value may be quickly patched up to a legal type (such as any-extending the returned `extract_vector_elt` up to a legal type) before `LegalizeDAG` notices. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102687	2021-05-20 18:00:03 +01:00
Stephen Tozer	cf725dde9c	[DebugInfo] Handle DIArgList in FastISel or GlobalIsel Currently, variadic dbg.values (i.e. those using a DIArgList as part of their location) are not handled properly by FastISel or GlobalISel, and will produce invalid DBG_VALUE instructions if they encounter them. This patch fixes this issue by emitting undef DBG_VALUE instructions for variadic dbg.values, so that no incorrect instruction is produced and any prior variable location is terminated. This is simply a quick-fix to prevent errors; a correct implementation should come later for these ISel pipelines to ensure that we do not drop debug information unnecessarily. Differential Revision: https://reviews.llvm.org/D102500	2021-05-20 17:37:28 +01:00
David Sherwood	a21bff0673	[CodeGen] Add support for widening the result of EXTRACT_SUBVECTOR When trying to return a type such as <vscale x 1 x i32> from a function we crash in DAGTypeLegalizer::WidenVecRes_EXTRACT_SUBVECTOR when attempting to get the fixed number of elements in the vector. For the simple case we are dealing with, i.e. extracting <vscale x 1 x i32> from index 0 of input vector <vscale x 4 x i32> we can simply rely upon existing code that just returns the input. Differential Revision: https://reviews.llvm.org/D102605	2021-05-20 12:27:08 +01:00
David Sherwood	d07d5c1b06	[CodeGen] Add support for widening INSERT_SUBVECTOR operands When attempting to return something like a <vscale x 1 x i32> type from a function we end up trying to widen the vector by inserting a <vscale x 1 x i32> subvector into an undefined <vscale x 4 x i32> vector. However, during legalisation we then attempt to widen the INSERT_SUBVECTOR operands and hit an error in WidenVectorOperand. This patch adds a new WidenVecOp_INSERT_SUBVECTOR function that currently only supports inserting subvectors into undefined vectors. Differential Revision: https://reviews.llvm.org/D102501	2021-05-20 10:37:03 +01:00
Amara Emerson	57ea5d4f48	[GlobalISel] Fix div+rem -> divrem combine causing use-def violation.	2021-05-19 23:13:41 -07:00
Jon Roelofs	4bf69fb52b	[Remarks] Add analysis remarks for memset/memcpy/memmove lengths Differential revision: https://reviews.llvm.org/D102452	2021-05-19 15:09:18 -07:00
Jessica Paquette	84ae1cf8ed	Recommit "[GlobalISel] Simplify G_ICMP to true/false when the result is known" Add missing REQUIRES line to prelegalizer-combiner-icmp-to-true-false-known-bits.	2021-05-19 09:29:19 -07:00
Simon Moll	66963bf381	[VP] make getFunctionalOpcode return an Optional The operation of some VP intrinsics do/will not map to regular instruction opcodes. Returning 'None' seems more intuitive here than 'Instruction::Call'. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D102778	2021-05-19 17:08:34 +02:00
Anirudh Prasad	f076da66b9	[AsmParser][SystemZ][z/OS] Introducing HLASM Parser support to AsmParser - Part 1 - This patch (is one in a series of patches) which introduces HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html \| main RFC here ]]) - This patch in particular introduces HLASM Parser support for Z machine instructions. - The approach taken here was to subclass `AsmParser`, and make various functions and variables as "protected" wherever appropriate. - The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well. The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf \| HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format): ``` <TokA><spaces.><TokB><spaces.><TokC><spaces.*><TokD> ``` 1. TokA is referred to as the Name Entry. This token is optional 2. TokB is referred to as the Operation Entry. This token is mandatory. 3. TokC is referred to as the Operand Entry. This token is mandatory 4. TokD is referred to as the Remarks Entry. This token is optional - If TokA is provided, then we either parse TokA as a possible comment or as a label (Name Entry), Tok B as the Operation Entry and so on. - If TokA is not provided (i.e. we have one or more spaces and then the first token), then we will parse the first token (i.e TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction and TokD as a possible Remark field - TokC (Operand Entry), no spaces are allowed between OperandEntries. If a space occurs it is classified as an error. - TokD if provided is taken as is, and emitted as a comment. 
The following additional approach was examined, but not taken: - Adding custom private only functions to base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of no use to any other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would also have the same disadvantage. Testing: - This patch doesn't have tests added with it, for the sole reason that MCStreamer Support and Object File support haven't been added for the z/OS target (yet). Hence, it's not possible to generate code outright for the z/OS target. They are in the process of being committed / process of being worked on. - Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated. Reviewed By: Kai, uweigand Differential Revision: https://reviews.llvm.org/D98276	2021-05-19 11:05:30 -04:00
Simon Pilgrim	707fc2e2f2	Revert rG528bc10e95d5f9d6a338f9bab5e91d7265d1cf05 : "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB" Reports on D101970 indicate this is causing failures on multi-stage compiles.	2021-05-19 15:01:20 +01:00
Nico Weber	52a7797626	Revert "[GlobalISel] Simplify G_ICMP to true/false when the result is known" This reverts commit `892497c806`. Breaks tests, see comments on https://reviews.llvm.org/D102542	2021-05-19 09:02:27 -04:00
Sanjay Patel	6025663578	[SDAG] propagate FMF from target-specific IR intrinsics This is a step towards relying more on node-level FMF rather than function-wide or target settings. I think it was just an oversight that we didn't get this path in D87361 or follow-on patches. The lack of FMF propagation is blocking D90901 from converting tests to IR-level FMF. We can't do much more than this currently because we also fail to propagate flags from x86-specific node to generic FMA node. That would be another patch, so the test just verifies that we can transfer from IR to initial SDAG node. Differential Revision: https://reviews.llvm.org/D102725	2021-05-19 07:50:50 -04:00
Tim Northover	c1dc267258	MachineBasicBlock: add liveout iterator aware of which liveins are defined by the runtime. Using this in RegAlloc fast reduces register pressure, and in some cases allows x86 code to compile that wouldn't before.	2021-05-19 11:00:24 +01:00
Rong Xu	60a097e511	Fix sanitizer test errors from commit `886629a8` Explicitly handle the empty string in the Hash calculation.	2021-05-18 22:46:51 -07:00
Guozhi Wei	528bc10e95	[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 Similar optimization can also be applied to LEA/ADD sequence. The modifications to TwoAddressInstructionPass is to ensure the operands of ADD instruction has expected order (the dest register of LEA should be src register of ADD). Differential Revision: https://reviews.llvm.org/D101970	2021-05-18 18:02:36 -07:00
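The arithmetic behind the LEA/SUB to SUB/SUB rewrite in `528bc10e95` can be checked with a small standalone sketch. It is not LLVM code: the function names are hypothetical, registers are modeled as 64-bit unsigned values with wrapping arithmetic, and only the final value of reg4 is modeled (flags and later uses of reg3 are ignored).

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they model the value left in
// reg4 by each instruction sequence, using modulo-2^64 arithmetic.

// lea (reg1, reg2), reg3 ; sub reg3, reg4
constexpr uint64_t leaThenSub(uint64_t reg1, uint64_t reg2, uint64_t reg4) {
  uint64_t reg3 = reg1 + reg2; // lea forms the sum of the two registers
  return reg4 - reg3;          // sub reg3, reg4  =>  reg4 = reg4 - reg3
}

// sub reg1, reg4 ; sub reg2, reg4
constexpr uint64_t subThenSub(uint64_t reg1, uint64_t reg2, uint64_t reg4) {
  reg4 -= reg1; // first sub
  reg4 -= reg2; // second sub
  return reg4;
}

// Both sequences compute reg4 - reg1 - reg2.
static_assert(leaThenSub(1, 2, 10) == subThenSub(1, 2, 10),
              "the two sequences should agree");
```

Because unsigned subtraction wraps, the two forms agree even when the intermediate sum or difference overflows.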
Rong Xu	a32e39a75b	Fix a buildbot failure from commit `886629a8`	2021-05-18 16:53:34 -07:00
Rong Xu	886629a8c9	[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This patch implements the first part of Flow Sensitive SampleFDO (FSAFDO). It has the following changes: (1) disable current discriminator encoding scheme, (2) new hierarchical discriminator for FSAFDO. For this patch, option "-enable-fs-discriminator=true" turns on the new functionality. Option "-enable-fs-discriminator=false" (the default) keeps the current SampleFDO behavior. When the fs-discriminator is enabled, we insert a flag variable, namely, llvm_fs_discriminator, to the object. This symbol will be checked by the create_llvm_prof tool, and used to generate a profile with FS-AFDO discriminators enabled. If this happens, for an extbinary format profile, the create_llvm_prof tool will add a flag to the profile summary section. Differential Revision: https://reviews.llvm.org/D102246	2021-05-18 16:23:43 -07:00
Arthur Eubanks	bc7d15c61d	[NFC] Use ArgListEntry indirect types more in ISel lowering For opaque pointers, we're trying to avoid uses of PointerType::getElementType(). A couple of ISel places use PointerType::getElementType(). Some of these are easy to fix by using ArgListEntry's indirect types. The inalloca type wasn't stored there, as opposed to preallocated and byval which have their indirect types available, so add it and use it. This is a reland after an MSan fix in D102667. Differential Revision: https://reviews.llvm.org/D101713	2021-05-18 14:30:22 -07:00
Arthur Eubanks	1c7f32334d	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. This is a reland after fixing MSan issues in D102667. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-05-18 14:30:22 -07:00
Arthur Eubanks	6b9524a05b	[NewPM] Don't mark AA analyses as preserved Currently all AA analyses marked as preserved are stateless, not taking into account their dependent analyses. So there's no need to mark them as preserved, they won't be invalidated unless their analyses are. SCEVAAResults was the one exception to this, it was treated like a typical analysis result. Make it like the others and don't invalidate unless SCEV is invalidated. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102032	2021-05-18 13:49:03 -07:00
Jessica Paquette	892497c806	[GlobalISel] Simplify G_ICMP to true/false when the result is known Use existing KnownBits helpers from KnownBits.h to simplify G_ICMPs. E.g. x == x -> true x != x -> false load(x) > 1 -> true (when the load is known to be greater than 1) And so on. Differential Revision: https://reviews.llvm.org/D102542	2021-05-18 09:26:41 -07:00
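The idea in `892497c806` of folding a comparison from known bits can be sketched without LLVM. The `KnownBits32` struct and `foldUGT` function below are hypothetical stand-ins, not the helpers in KnownBits.h. They assume the usual encoding, where one mask records bits known to be 0 and another records bits known to be 1.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for known-bits information about a 32-bit value.
// A set bit in Zero means that bit is known to be 0; a set bit in One means
// it is known to be 1. A bit set in neither mask is unknown.
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
  // Smallest possible unsigned value: every unknown bit taken as 0.
  constexpr uint32_t minValue() const { return One; }
  // Largest possible unsigned value: every unknown bit taken as 1.
  constexpr uint32_t maxValue() const { return ~Zero; }
};

// Tries to decide the unsigned comparison `x > C` from known bits alone.
// Returns true or false when the result is known, std::nullopt otherwise.
inline std::optional<bool> foldUGT(KnownBits32 X, uint32_t C) {
  if (X.minValue() > C)
    return true;  // even the smallest candidate exceeds C
  if (X.maxValue() <= C)
    return false; // even the largest candidate does not exceed C
  return std::nullopt;
}
```

For example, if bit 1 of a loaded value is known to be set, its minimum is 2, so `load(x) > 1` folds to true, matching the case listed in the commit message.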
David Green	b3d38327b2	[RDA] Fix printing of regs / reg units. NFC It was printing RegUnits as Regs, leading to much confusion in the debug logs.	2021-05-18 08:07:30 +01:00
Ten Tzen	797ad70152	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1 This patch is the Part-1 (FE Clang) implementation of HW Exception handling. This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch. Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change. The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules: * First, no exception can move in or out of _try region, i.e., no "potential faulty instruction" can be moved across _try boundary. * Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). * Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs. The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception that occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process. Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph precisely. 
However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations aware of the new semantic is also a substantial effort. This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option. One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block inherits various states from its predecessors, the lowest State trumps the others. If some exits flow to unreachable, propagation on those paths terminates, not affecting remaining blocks. For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed: Warning: jump bypasses variable with a non-trivial destructor In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject an eha_scope_begin() invoke in the side entry block to ensure a correct State. Implementation: Part-1: Clang implementation described below. Two intrinsics are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). 
_scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless of whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining. Two intrinsics, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent exceptions from being moved across _try boundary. All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is slightly suboptimal. But it's acceptable as the amount of code directly under _try is very small. Part-2 (will be in Part-2 patch): LLVM implementation described below. For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also relies on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1 (note the purpose of that is to ensure the return address of a call is in the same scope as the call address). The handler for catch(...) for -EHa must handle HW exception. So its 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/nothrow) so that HW exceptions can be passed through. 
Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html Differential Revision: https://reviews.llvm.org/D80344/new/	2021-05-17 22:42:17 -07:00
Serguei Katkov	57c660f374	[Statepoint Lowering] Cleanup: remove unused option statepoint-always-spill-base.	2021-05-18 12:15:15 +07:00
Nick Desaulniers	0f41778919	[AArch64] Support customizing stack protector guard Follow up to D88631 but for aarch64; the Linux kernel uses the command line flags: 1. -mstack-protector-guard=sysreg 2. -mstack-protector-guard-reg=sp_el0 3. -mstack-protector-guard-offset=0 to use the system register sp_el0 for the stack canary, enabling the kernel to have a unique stack canary per task (like a thread, but not limited to userspace as the kernel can preempt itself). Address pr/47341 for aarch64. Fixes: https://github.com/ClangBuiltLinux/linux/issues/289 Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: xiangzhangllvm, DavidSpickett, dmgreen Differential Revision: https://reviews.llvm.org/D100919	2021-05-17 11:49:22 -07:00
Simon Pilgrim	c29522d648	[TargetLowering] prepareUREMEqFold/prepareSREMEqFold - account for non legal shift types Ensure we tell getShiftAmountTy that we're working with pre-legalized types to prevent cases where the (legalized) shift type can no longer handle the (non-legalized) type width. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366	2021-05-17 11:03:27 +01:00
Tim Northover	82a0e808bb	IR/AArch64/X86: add "swifttailcc" calling convention. Swift's new concurrency features are going to require guaranteed tail calls so that they don't consume excessive amounts of stack space. This would normally mean "tailcc", but there are also Swift-specific ABI desires that don't naturally go along with "tailcc" so this adds another calling convention that's the combination of "swiftcc" and "tailcc". Support is added for AArch64 and X86 for now.	2021-05-17 10:48:34 +01:00
Fraser Cormack	85e31eddf2	[DAGCombiner] Relax an assertion to an early return The select-of-constants transform was asserting that its constant vector inputs did not implicitly truncate their input without that as an explicit precondition to the function. This patch relaxes that assertion into an early return to skip the optimization. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102393	2021-05-17 09:15:55 +01:00
Arthur Eubanks	341902672c	Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry" This reverts commit `16748bd2fb`. Causes https://crbug.com/1209013	2021-05-16 22:02:10 -07:00
Arthur Eubanks	7647cb14dc	Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering" This reverts commit `85af8a8c1b`.	2021-05-16 22:00:54 -07:00
Pan, Tao	976a3e5f61	[SelectionDAG] Make fast and linearize visible by clang -pre-RA-sched ScheduleDAGFast.cpp is compiled to an object file, but the ScheduleDAGFast object file isn't linked into the clang executable because none of its symbols is referenced from outside. Add a call to createXxx of ScheduleDAGFast.cpp, so that the ScheduleDAGFast object file is linked into the clang executable. The static RegisterScheduler will then register the fast and linearize schedulers at clang startup. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D101601	2021-05-17 11:25:15 +08:00
David Green	dd5c52029d	[CPG][ARM] Optimize towards branch on zero in codegenprepare This adds a simple fold into codegenprepare that converts comparison of branches towards comparison with zero if possible. For example: %c = icmp ult %x, 8 br %c, bla, blb %tc = lshr %x, 3 becomes %tc = lshr %x, 3 %c = icmp eq %tc, 0 br %c, bla, blb As a first order approximation, this can reduce the number of instructions needed to perform the branch as the shift is (often) needed anyway. At the moment this does not affect very much, as llvm tends to prefer the opposite form. But it can protect against regressions from commits like rG9423f78240a2. Simple cases of Add and Sub are added along with Shift, equally as the comparison to zero can often be folded with cpsr flags. Differential Revision: https://reviews.llvm.org/D101778	2021-05-16 17:54:06 +01:00
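The equivalence that the codegenprepare fold in `dd5c52029d` relies on can be illustrated in isolation. The sketch below is not the pass itself: the two function names are hypothetical and simply mirror the two IR forms from the commit message for a 32-bit unsigned `%x`.

```cpp
#include <cstdint>

// Original form:  %c = icmp ult %x, 8
constexpr bool cmpUlt8(uint32_t X) { return X < 8u; }

// Rewritten form: %tc = lshr %x, 3
//                 %c  = icmp eq %tc, 0
constexpr bool shiftIsZero(uint32_t X) { return (X >> 3) == 0u; }

// X < 8 holds exactly when no bit at position 3 or above is set, which is
// the same as the logical right shift by 3 producing zero.
static_assert(cmpUlt8(7u) == shiftIsZero(7u), "agree just below the bound");
static_assert(cmpUlt8(8u) == shiftIsZero(8u), "agree at the bound");
```

The same reasoning applies to any power-of-two bound 2^k with a shift by k. The benefit described in the commit comes from the shift result already being needed elsewhere.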
Pengxuan Zheng	c9b36a041f	Support GCC's -fstack-usage flag This patch adds support for GCC's -fstack-usage flag. With this flag, a stack usage file (i.e., .su file) is generated for each input source file. The format of the stack usage file is also similar to what is used by GCC. For each function defined in the source file, a line with the following information is produced in the .su file. <source_file>:<line_number>:<function_name> <size_in_byte> <static/dynamic> "Static" means that the function's frame size is static and the size info is an accurate reflection of the frame size. While "dynamic" means the function's frame size can only be determined at run-time because the function manipulates the stack dynamically (e.g., due to variable size objects). The size info only reflects the size of the fixed size frame objects in this case and therefore is not a reliable measure of the total frame size. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100509	2021-05-15 10:22:49 -07:00
Simon Pilgrim	c5fe383990	IfConverter::MeetIfcvtSizeLimit - Fix uninitialized variable warnings. NFCI. Ensure the duplication instruction counts are initialized to zero (even though they aren't used) to silence static analysis warnings.	2021-05-15 14:51:54 +01:00
Nikita Popov	fb9ed1979a	[IR] Add BasicBlock::isEntryBlock() (NFC) This is a recurring and somewhat awkward pattern. Add a helper method for it.	2021-05-15 12:41:58 +02:00
Amara Emerson	80c534a8f9	[GlobalISel][CallLowering] Fix crash when handling a v3s32 type that's being passed as v2s64.	2021-05-14 16:30:51 -07:00
Sanjay Patel	9dfd7f9b67	[SDAG] reduce code duplication for extend_vec_inreg combines; NFC These are identical so far, and I was looking at adding a fold for a pattern with scalar_to_vector which would also end up duplicated.	2021-05-14 08:29:57 -04:00
Tim Northover	ea0eec69f1	IR+AArch64: add a "swiftasync" argument attribute. This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there: * 0b0000 => a normal, non-extended record with just [FP, LR] * 0b0001 => the extended record [X22, FP, LR] * 0b1111 => kernel space, and a non-extended record. All other values are currently reserved. If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key. There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).	2021-05-14 11:43:58 +01:00
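A stack-walking tool could interpret the tag that `ea0eec69f1` describes roughly as follows. This is a minimal sketch: the enum and function names are hypothetical, and only the three tag values and the bit position 63:60 come from the commit message.

```cpp
#include <cstdint>

// Hypothetical classification of a saved frame pointer value, based on the
// tag in bits 63:60 described in the commit message.
enum class FrameRecordKind {
  Normal,        // 0b0000: non-extended record [FP, LR]
  ExtendedAsync, // 0b0001: extended record [X22, FP, LR]
  Kernel,        // 0b1111: kernel space, non-extended record
  Reserved       // every other value is currently reserved
};

constexpr FrameRecordKind classifyFramePointer(uint64_t FP) {
  switch ((FP >> 60) & 0xF) { // extract bits 63:60
  case 0x0:
    return FrameRecordKind::Normal;
  case 0x1:
    return FrameRecordKind::ExtendedAsync;
  case 0xF:
    return FrameRecordKind::Kernel;
  default:
    return FrameRecordKind::Reserved;
  }
}
```

A walker that sees `ExtendedAsync` would expect the context slot next to the saved FP and LR, and would need to clear the tag bits before following the FP chain.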
David Spickett	2db090a2eb	[llvm][AsmPrinter] Restore source location to register clobber warning Since `5de2d189e6` this particular warning hasn't had the location of the source file containing the inline assembly. Fix this by reporting via LLVMContext. Which means that we no longer have the "instantiated into assembly here" lines but they were going to point to the start of the inline asm string anyway. This message is already tested via IR in llvm. However we won't have the required location info there so I've added a C file test in clang to cover it. (though strictly, this is testing llvm code) Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D102244	2021-05-14 08:22:57 +00:00
Chen Zheng	61484762e9	[Debug-Info] change Tag type to dwarf::Tag for createAndAddDIE; NFC Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102207	2021-05-13 21:15:06 -04:00
Chen Zheng	75f3beeedf	[Debug-Info] make DIE attributes generation under strict DWARF control Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101024	2021-05-13 20:34:07 -04:00
cynecx	8ec9fd4839	Support unwinding from inline assembly I've taken the following steps to add unwinding support from inline assembly: 1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax: ``` invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"() to label %exit unwind label %uexit ``` 2.) Add Bitcode writing/reading support + LLVM-IR parsing. 3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled. 4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind. 5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled. 6.) Don't allow unwinding callbr. Reviewed By: Amanieu Differential Revision: https://reviews.llvm.org/D95745	2021-05-13 19:13:03 +01:00
Max Kazantsev	d8b37de8a4	[GC][NFC] Move GCStrategy from CodeGen to IR We want it to be available in analyses so that we could use the CodeGen notion in middle-end passes (for example, to check if a GC may free some particular pointer). This is a preparatory patch that simply moves the files around. Note: if this causes some build issues, this patch must just be reverted. Differential Revision: https://reviews.llvm.org/D100557 Reviewed By: reames	2021-05-13 12:31:59 +07:00
Chen Zheng	a0ca4c46ca	[Debug-Info] add -gstrict-dwarf support in backend Reviewed By: dblaikie, probinson Differential Revision: https://reviews.llvm.org/D100826	2021-05-12 23:00:52 -04:00
Sam Clegg	3041b16f73	[WebAssembly] Add TLS data segment flag: WASM_SEG_FLAG_TLS Previously the linker was relying solely on the name of the segment to imply TLS. Differential Revision: https://reviews.llvm.org/D102202	2021-05-12 13:31:02 -07:00
Fraser Cormack	c5ec00e62b	[TargetLowering] Improve legalization of scalable vector types This patch extends the vector type-conversion and legalization capabilities of scalable vector types. Firstly, `vscale x 1` types now behave more like the corresponding `vscale x 2+` types. This enables the integer promotion legalization of extended scalable types, such as the promotion of `<vscale x 1 x i5>` to `<vscale x 1 x i8>`. These `vscale x 1` types are also now better handled by `getVectorTypeBreakdown`, where what looks like older handling for 1-element fixed-length vector types was spuriously updated to include scalable types. Widening of scalable types is now better supported, by using `INSERT_SUBVECTOR` to insert the smaller scalable vector "value" type into the wider scalable vector "part" type. This allows AArch64 to pass and return `vscale x 1` types by value by widening. There are still cases where we are unable to legalize `vscale x 1` types, such as where expansion would require splitting the vector in two. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102073	2021-05-12 16:33:07 +01:00
Stefan Pintilie	8d37411e48	Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics" This reverts commit `6c80361b84`. Breaks PowerPC Big Endian buildbots.	2021-05-12 09:46:18 -05:00
Hendrik Greving	762ac725bf	[DAGCombiner] Fix DAG combine store elimination, different address space. Fixes a bug in the DAG combiner that eliminates the stores because it failed to inspect the address space of the pointers. %v = load %ptr_as1 // no chain side effect store %v, %ptr_as2 As well as store %v, %ptr_as1 store %v, %ptr_as2 Fixes a test for above in X86. Differential Revision: https://reviews.llvm.org/D102096	2021-05-12 07:14:22 -07:00
Jay Foad	a383d325f6	[TargetRegisterInfo] Speed up getAllocatableSet. NFCI. MachineRegisterInfo caches the reserved register set that is computed by TargetRegisterInfo::getReservedRegs, so call into MRI to get the reserved regs to avoid recomputing them. In particular this speeds up AMDGPU's SIFormMemoryClauses pass because AMDGPU has a particularly complicated reserved set that is expensive to compute. Differential Revision: https://reviews.llvm.org/D102318	2021-05-12 14:09:05 +01:00
Stephen Tozer	fdb055f4f1	Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" Previous crashes caused by this patch were the result of machine subregisters being incorrectly handled in updateDbgUsersToReg; this has been fixed by using RegUnits to determine overlapping registers, instead of using the register values directly. Differential Revision: https://reviews.llvm.org/D101523 This reverts commit `7ca26c5fa2`.	2021-05-12 10:19:57 +01:00
Matt Arsenault	6f5ddf6731	GlobalISel: Don't hardcode varargs=false in resultsCompatible	2021-05-11 20:22:06 -04:00
Matt Arsenault	24e2e5df0e	GlobalISel: Split ValueHandler into assignment and emission classes Currently the ValueHandler handles both selecting the type and location for arguments, as well as inserting instructions needed to handle them. Split this so that the determination of the argument handling is independent of the function state. Currently the checks for tail call compatibility do not follow the full assignment logic, so it misses cases where arguments require nontrivial legalization. This should help avoid targets ending up in a buggy state where the argument evaluation may change in different contexts.	2021-05-11 19:50:12 -04:00
Matt Arsenault	bce3cca488	CodeGen: Fix null dereference before null check	2021-05-11 09:07:32 -04:00
Denis Antrushin	df47368d40	[RegAllocFast] properly handle STATEPOINT instruction. STATEPOINT is a fancy and complex pseudo instruction which has both tied defs and regmask operand. Basic FastRA algorithm is as follows: 1. Mark registers used by defs as free 2. If instruction has regmask operand displace clobbered registers according to regmask. 3. Assign registers for use operands. In case of tied defs step 1 is replaced with allocation of registers for them. But regmask is still processed, which may displace already allocated registers. As a result, tied use and def will get assigned to different registers. This patch makes FastRA process the instruction's RegMask (if any) when checking for physical register interference. That way tied operands won't get registers clobbered by regmask. Reviewed By: arsenm, skatkov Differential Revision: https://reviews.llvm.org/D99284	2021-05-11 17:27:00 +07:00
Sam Clegg	3b8d2be527	Reland: "[lld][WebAssembly] Initial support merging string data" This change was originally landed in: `5000a1b4b9` It was reverted in: `061e071d8c` This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world. Unlike the ELF linker this support is currently limited: - No support for SHF_MERGE (non-string merging) - Always do full tail merging ("lo" can be merged with "hello") - Only support single byte strings (p2align 0) Like the ELF linker merging is only performed at `-O1` and above. This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it does not currently support debug sections because they are not represented by data segments (they are custom sections) Differential Revision: https://reviews.llvm.org/D97657	2021-05-10 16:03:38 -07:00
Nico Weber	061e071d8c	Revert "[lld][WebAssembly] Initial support merging string data" This reverts commit `5000a1b4b9`. Breaks tests, see https://reviews.llvm.org/D97657#2749151 Easily repros locally with `ninja check-llvm-mc-webassembly`.	2021-05-10 18:28:28 -04:00
Sam Clegg	5000a1b4b9	[lld][WebAssembly] Initial support merging string data This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world. Unlike the ELF linker this support is currently limited: - No support for SHF_MERGE (non-string merging) - Always do full tail merging ("lo" can be merged with "hello") - Only support single byte strings (p2align 0) Like the ELF linker merging is only performed at `-O1` and above. This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it does not currently support debug sections because they are not represented by data segments (they are custom sections) Differential Revision: https://reviews.llvm.org/D97657	2021-05-10 13:15:12 -07:00
Arthur Eubanks	85af8a8c1b	[NFC] Use ArgListEntry indirect types more in ISel lowering For opaque pointers, we're trying to avoid uses of PointerType::getElementType(). A couple of ISel places use PointerType::getElementType(). Some of these are easy to fix by using ArgListEntry's indirect types. The inalloca type wasn't stored there, as opposed to preallocated and byval which have their indirect types available, so add it and use it. Differential Revision: https://reviews.llvm.org/D101713	2021-05-10 13:05:15 -07:00
Arthur Eubanks	16748bd2fb	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-05-10 12:35:11 -07:00
Amara Emerson	dc75499998	[GlobalISel][IRTranslator] Fix bit-test lowering dropping phi edges. For contiguous ranges we drop the last bit-test case but in doing so we skip adding the new MBB PHI edges to the list of replacement PHI edges, and as a result we incorrectly omit them in the G_PHI in finishPendingPhis(). Was found when bootstrapping clang with -O3 and GlobalISel enabled on Apple Silicon.	2021-05-10 11:59:31 -07:00
Harald van Dijk	b0ef2070bc	[X86] Fix position-independent TType encoding The logic for x86_64 position-independent TType encodings was backwards, using 8 bytes where 4 were wanted and 4 where 8 were wanted. For regular x86_64, this was mostly harmless, exception tables are allowed to use 8-byte encodings even when it is not needed. For the large code model, and for X32, however, the generated exception tables were wrong. For the large code model, we cannot assume that the address will fit in 4 bytes. For X32, we cannot use 64-bit relocations. Fixes PR50148. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102132	2021-05-10 17:04:33 +01:00
Bradley Smith	635164b95a	[AArch64][SVE] Improve SVE codegen for fixed length BITCAST Expanding a fixed length operation involves wrapping the operation in an insert/extract subvector pair, as such, when this is done to bitcast we end up with an extract_subvector of a bitcast. DAGCombine tries to convert this into a bitcast of an extract_subvector which restores the initial fixed length bitcast, causing an infinite loop of legalization. As part of this patch, we must make sure the above DAGCombine does not trigger after legalization if the created bitcast would not be legal. Differential Revision: https://reviews.llvm.org/D101990	2021-05-10 14:43:53 +01:00
Fraser Cormack	3212a08a8c	[Constant] Allow ConstantAggregateZero a scalable element count A ConstantAggregateZero may be created from a scalable vector type. However, it still assumed a fixed number of elements when queried for them. This patch changes ConstantAggregateZero to correctly report its element count. This change fixes a couple of issues. Firstly, it fixes a crash in Constant::getUniqueValue when called on a scalable-vector zeroinitializer constant. Secondly, it fixes a latent bug in GlobalISel's IRTranslator in which translating a scalable-vector zeroinitializer would hit the assertion in ConstantAggregateZero::getNumElements when casting to a FixedVectorType, rather than reporting an error more gracefully. This is currently hypothetical as the IRTranslator has deeper issues preventing the use of scalable vector types. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102082	2021-05-10 13:51:53 +01:00
Momchil Velikov	f3139b20a0	[GlobalISel] Fix wrong invocation of `getParamStackAlign` (NFC) The function template `CallLowering::setArgFlags` is invoked both for arguments and return values. In the latter case, it calls `getParamStackAlign` with argument index `~0u`. Nothing wrong happens now, as the argument is safely incremented back to 0 inside `getParamStackAlign` (the type is `unsigned`), but in principle it's fragile and may become incorrect. Differential Revision: https://reviews.llvm.org/D102004	2021-05-10 12:16:33 +01:00
Fraser Cormack	6db0cedd23	[LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion This patch extends VectorLegalizer::ExpandSELECT to permit expansion also for scalable vector types. The only real change is conditionally checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the vector type. We can use this to fix "cannot select" errors for scalable vector selects on the RISCV target. Note that in future patches RISCV will possibly custom-lower vector SELECTs to VSELECTs for branchless codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102063	2021-05-10 08:22:35 +01:00
Xiang1 Zhang	d4bdeca576	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-05-08 14:21:11 +08:00
Xiang1 Zhang	bebafe01a7	Revert "[X86] Support AMX fast register allocation" This reverts commit `77e2e5e07d`.	2021-05-08 13:43:32 +08:00
Xiang1 Zhang	77e2e5e07d	[X86] Support AMX fast register allocation	2021-05-08 13:27:21 +08:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
Amara Emerson	808bc11d9e	[GlobalISel] Don't form zero/sign extending loads for atomics. For importing patterns, we only support matching G_LOAD, not G_ZEXTLOAD or G_SEXTLOAD. Differential Revision: https://reviews.llvm.org/D101932	2021-05-07 16:41:48 -07:00
Arthur Eubanks	7ca26c5fa2	Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" This reverts commit `0791f968fe`. Causing crashes: https://crbug.com/1206764	2021-05-07 12:05:16 -07:00
Fangrui Song	d8aba75a76	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
Fangrui Song	6a2850f3fc	[AArch64][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local Similar to X86 D73230 & `46788a21f9` With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode, for default visibility external linkage non-ifunc-non-COMDAT definitions. For such dso_local definitions, variable access/taking the address of a function/calling a function will go through a local alias to avoid GOT/PLT. Note: the 'S' inline assembly constraint refers to an absolute symbolic address or a label reference (D46745). Differential Revision: https://reviews.llvm.org/D101872	2021-05-07 09:44:26 -07:00
Stephen Tozer	7bc1dd1191	Reapply "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands" Reapply b623df3c, which was reverted while reverting a different patch with a breaking change. There are no underlying issues with this patch, so no changes have been made to the original patch. This reverts commit `b11e4c9907`.	2021-05-07 14:55:02 +01:00
Simon Pilgrim	c9d4b4173b	[CodeGen] Ensure UserValue::getDebugLoc() and UserLabel::getDebugLoc() consistently return a const reference NFCI. Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.	2021-05-07 14:48:23 +01:00
Simon Pilgrim	dd21c6b843	[DAG] Ensure all SD classes consistently return a const reference with getDebugLoc(). NFCI. Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.	2021-05-07 14:48:23 +01:00
Benjamin Kramer	6248d11190	Retire TargetRegisterInfo::getSpillAlignment getSpillAlign does the same thing.	2021-05-07 15:16:22 +02:00
Stephen Tozer	ce0c1f3ced	[DebugInfo] Fix crash when emitting an invalidated SDDbgValue This patch fixes a crash in the compiler that occurs when certain invalidated SDDbgValues are emitted. The cause of this was that we would attempt to check the liveness of the debug value's operands, which triggers an assert if any of those operands are invalid. This patch changes this check such that it only occurs if the SDDbgValue is valid; if not, the check is irrelevant anyway, so can be safely ignored. Differential Revision: https://reviews.llvm.org/D101540	2021-05-07 13:13:56 +01:00
Simon Pilgrim	280aa3415e	[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts. Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication. I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to). NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly. Differential Revision: https://reviews.llvm.org/D101987	2021-05-07 13:12:30 +01:00
Stephen Tozer	0791f968fe	[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST This patch modifies updateDbgUsersToReg to properly handle DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices (i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and updating the register for all matching operands. Differential Revision: https://reviews.llvm.org/D101523	2021-05-07 11:47:50 +01:00
Guillaume Chatelet	e805b7c2d6	[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen Differential Revision: https://reviews.llvm.org/D102058	2021-05-07 10:22:41 +00:00
Sebastian Neubauer	98e5ede604	[AMDGPU] Serialize MFInfo::ScavengeFI Serialize ScavengeFI from SIMachineFunctionInfo into yaml. ScavengeFI is not used outside of the PrologEpilogInserter, so this shouldn't change anything. Differential Revision: https://reviews.llvm.org/D101367	2021-05-07 11:15:25 +02:00
Chen Zheng	9deb7eeaf7	[Debug-Info][NFC] add a wrapper for Die.addValue Add a new wrapper function addAttribute() for Die.addValue() function, so we can do some attributes control in one single interface. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101125	2021-05-07 07:24:09 +00:00
Amara Emerson	1ccebb18ef	[GlobalISel] Micro-optimize the conditional branch optimization. Convert a check into an assert and pass an MI instead of recomputing in the apply function.	2021-05-07 00:03:09 -07:00
Jessica Clarke	6c80361b84	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits, where zero-extending atomics were incorrectly returning 0 rather than the (slightly confusing) required return value of 1. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-06 04:01:20 +01:00
RamNalamothu	41f8b8e807	[MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling This change enables emitting CFI unwind information for debugging purpose for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None. Currently generating CFI unwind information is entangled with supporting the exceptions, even when AsmPrinter explicitly recognizes that the unwind tables are being generated as debug information. In fact, the unwind information is not generated even if we specify --force-dwarf-frame-section, unless exceptions are enabled. The LIT test llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior. Enable this option for AMDGPU to prepare for future patches which add complete CFI support. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D78778	2021-05-06 04:53:45 +05:30
Matt Arsenault	fa0b93b5a0	GlobalISel: Use DAG call lowering infrastructure in a more compatible way Unfortunately the current call lowering code is built on top of the legacy MVT/DAG based code. However, GlobalISel was not using it the same way. In short, the DAG passes legalized types to the assignment function, and GlobalISel was passing the original raw type if it was simple. I do believe the DAG lowering is conceptually broken since it requires picking a type up front before knowing how/where the value will be passed. This ends up being a problem for AArch64, which wants to pass i1/i8/i16 values as a different size if passed on the stack or in registers. The argument type decision is split across 3 different places which is hard to follow. SelectionDAG builder uses getRegisterTypeForCallingConv to pick a legal type, tablegen gives the illusion of controlling the type, and the target may have additional hacks in the C++ part of the call lowering. AArch64 hacks around this by not using the standard AnalyzeFormalArguments and special casing i1/i8/i16 by looking at the underlying type of the original IR argument. I believe people have generally assumed the calling convention code is processing the original types, and I've discovered a number of dead paths in several targets. x86 actually relies on the opposite behavior from AArch64, and relies on x86_32 and x86_64 sharing calling convention code where the 64-bit cases implicitly do not work on x86_32 due to using the pre-legalized types. AMDGPU targets without legal i16/f16 have always used a broken ABI that promotes to i32/f32. GlobalISel accidentally fixed this to be the ABI we should have, but this fixes it so we're using the worse ABI that is compatible with the DAG. Ideally we would fix the DAG to match the old GlobalISel behavior, but I don't wish to fight that battle. A new native GlobalISel call lowering framework should let the target process the incoming types directly. 
CCValAssigns select a "ValVT" and "LocVT" but the meanings of these aren't entirely clear. Different targets don't use them consistently, even within their own call lowering code. My current belief is the intent was "ValVT" is supposed to be the legalized value type to use in the end, and LocVT was supposed to be the ABI passed type (which is also legalized). With the default CCState::Analyze functions always passing the same type for these arguments, these only differ when the TableGen part of the lowering decides to promote the type from one legal type to another. AArch64's i1/i8/i16 hack ends up inverting the meanings of these values, so I had to add an additional hack to let the target interpret how large the argument memory is. Since targets don't consistently interpret ValVT and LocVT, this doesn't produce quite equivalent code to the initial DAG lowerings. I've opted to consistently interpret LocVT as the in-memory size for stack passed values, and ValVT as the register type to assign from that memory. We therefore produce extending loads directly out of the IRTranslator, whereas the DAG would emit regular loads of smaller values. This will also produce loads/stores that are wider than the argument value if the allocated stack slot is larger (and there will be undef padding bytes). If we had the optimizations to reduce load/stores based on truncated values, this wouldn't produce a different end result. Since ValVT/LocVT are more consistently interpreted, we now will emit more G_BITCASTS as requested by the CCAssignFn. For example AArch64 was directly assigning types to some physical vector registers which according to the tablegen spec should have been cast to a vector with a different element type. This also moves the responsibility for inserting G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the generic code, which is closer to how SelectionDAGBuilder works. 
I had to xfail an x86 test since I don't see a quick way to fix it right now (I filed bug 50035 for this). It's broken independently of this change, and only triggers since now we end up with more ands which hit the improperly handled selection pattern. I also observed that FP arguments that need promotion (e.g. f16 passed as f32) are broken, and use regular G_TRUNC and G_ANYEXT. TLDR; the current call lowering infrastructure is bad and nobody has ever understood how it chooses types.	2021-05-05 17:35:02 -04:00
Michael Kitzan	a11489ae3e	[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs - Move the code preventing CSE of `isConvergent` instrs into `ProcessBlockCSE` (from `isProfitableToCSE`) - Add comments explaining why `isConvergent` is used to prevent CSE of non-local instrs in MachineCSE and the new test	2021-05-05 14:22:03 -07:00
Philipp Krones	632ebc4ab4	[MC] Untangle MCContext and MCObjectFileInfo This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: You can't construct a MOFI without a MCContext without constructing the MCContext with a dummy version of that MOFI first. This removes this dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext making it more available to the different targets. Namely: - TargetTriple - ObjectFileType - SubtargetInfo Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101462	2021-05-05 10:03:02 -07:00
Jessica Clarke	897d7bceb9	Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics" This seems to have broken sanitizers, giving lots of Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed. failures across multiple targets (currently X86 and PowerPC). Reverting until I have a chance to reproduce and debug. This reverts commit `6e876f9ded`.	2021-05-05 17:02:05 +01:00
Jessica Clarke	6e876f9ded	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-05 16:34:45 +01:00
Vang Thao	a3d273c9ff	[GlobalISel] Fix buildZExtInReg creating new register. Fix a bug where buildZExtInReg will create and use a new register instead of using the register from parameter DstOp Res. Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D101871	2021-05-05 08:19:52 -07:00
Serguei Katkov	9f631d14c6	[GreedyRA] Add support for invoke statepoint with tied-defs. statepoint instruction uses tied-def registers to represent live gc value which is use and def at the same time on a call. At the same time invoke statepoint instruction is a last split point which can throw and jump to landing pad. As a result we have an instruction which is a last split point with tied-def registers and we need to teach Greedy RA to work with it. The option -use-registers-for-gc-values-in-landing-pad controls whether statepoint lowering will generate tied-defs for invoke statepoint and is off by default now. To resolve all issues the following changes have been made. 1) Last Split point for invoke statepoint should be statepoint itself If statepoint has a def it is a relocated gc pointer and it should be available in landing pad. So we cannot split interval after statepoint at end of basic block. 2) Do not split interval on tied-def If end of interval for overlap utility is a use which has tied-def we should not split interval on this instruction because in this case use and def may have different registers, which breaks the tied-def property. 3) Take into account Last Split Point for enterIntvAtEnd If the use after Last Split Point is a def so it should be tied-def and we can take the def of the tied-use as ParentVNI and thus tied-use and tied-def will be live in resulting interval. 4) Handle the case when def is after LIP in InlineSpiller If def of LI is after last insertion point of basic block we cannot hoist in this BB. The example of such instruction is invoke statepoint where def represents the relocated live gc pointer. Invoke is a last insertion point and its def is located after it. In this case there is no place to insert spill and we bail out. 5) Fix removeBackCopies to account empty copies RegAssignMap cannot hold empty interval, so do not set stop to kill value if it produces empty interval. 
This can happen if we remove back-copy and right before that we have another back-copy. For example, for parent %0 we can get %1 = COPY %0 %2 = COPY %0 while removing %2 we cannot set kill for %1 because it is empty. 6) Do not hoist copy to BB if its def is after LSP If the parent def is a LastSplitPoint or later we cannot hoist copy to this basic block because inserted copy (or re-materialization) will be located before the def. All parts have been reviewed separately as follows: https://reviews.llvm.org/D100747 https://reviews.llvm.org/D100748 https://reviews.llvm.org/D100750 https://reviews.llvm.org/D100927 https://reviews.llvm.org/D100945 https://reviews.llvm.org/D101028 Reviewers: reames, rnk, void, MatzeB, wmi, qcolombet Reviewed By: reames, qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D101150	2021-05-05 11:13:35 +07:00
Fraser Cormack	6523ff6d47	[ValueTypes] Add MVTs for v256i16 and v256f16 This patch adds the two MVTs to fix a legalizer crash when using vector shuffles of <256 x i16> and <128 x i16> on RISC-V. The legalizer can't promote the operand of `v256i32 = any_extend_vector_inreg v128i16`. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D101769	2021-05-04 18:06:13 +01:00
Christudasan Devadasan	80c79035ef	DAG: Cleanup assertion in EmitFuncArgumentDbgValue Removing an assertion introduced with D68945. The patch was later reverted with `6531a78ac4`, but failed to remove this assertion. It causes a problem while trying to split a 64-bit argument into sub registers. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D101594	2021-05-04 21:48:58 +05:30
Simon Moll	1db4dbba24	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert `02c5ba8679` Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULES=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
Tomas Matheson	9d86095ff8	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `753185031d`.	2021-05-03 21:48:20 +01:00
Tomas Matheson	753185031d	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as `3338290c18`. Reverted in `c7df6b1223`.	2021-05-03 20:25:15 +01:00
Paul Robinson	1d299252dd	[DebuggerTuning] Move a comment to a more useful place. The comment about how to make use of debugger tuning within DwarfDebug really belongs inside the DwarfDebug declaration, where it will be easier to find.	2021-05-03 11:08:04 -07:00
Craig Topper	6430430958	[TableGen] Use sign rotated VBR for OPC_EmitInteger. This allows for a much more efficient encoding for small negative numbers by storing the sign bit first and negating the rest of the bits. This was already being used for OPC_CheckInteger. For every in tree target this affects, the table got smaller. R600GenDAGISel.inc saw the largest reduction of 7K. I did have to add a new opcode for StringIntegers used for register class ids and subregister indices since we don't have the integer value to encode. The enum name is emitted directly into the table. Previously assumed the enum would expand to a positive 7-bit number. We might be able to just shift that right by 1 and assume it is a positive 6 bit number, but that will need more investigation.	2021-05-02 12:40:44 -07:00
Nathan Chancellor	4397b7095d	Revert "Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This reverts commit `791930d740`, as per https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy. I observed breakage with the Linux kernel, as reported at https://reviews.llvm.org/D91722#2724321 Fixes exist at https://reviews.llvm.org/D101523 https://reviews.llvm.org/D101540 but they have not landed so to unbreak the tree for the weekend, revert this commit. Commit `b11e4c9907` ("Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"") only reverted one follow-up fix, not the original patch that broke the kernel. e	2021-04-30 20:23:21 -07:00
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit `43bc584dc0`. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Nick Desaulniers	b11e4c9907	Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands" This reverts commit b622df3c93983c4512aa54f2c706716bdf865a90, as per https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy. Breakages observed downstream reported in: https://reviews.llvm.org/D91722#2724321 Fixes exist in: https://reviews.llvm.org/D101523 https://reviews.llvm.org/D101540 but haven't landed yet going into the weekend.	2021-04-30 16:45:37 -07:00
Jon Roelofs	421569b244	[EarlyIfConversion] Avoid producing selects with identical operands This extends the early-ifcvt pass to avoid a few more cases where the resulting select instructions would have matching operands. Additionally, we now use TII to determine "sameness" of the operands so that as TII gets smarter, so too will ifcvt. The attached test case was bugpoint-reduced down from CINT2000/252.eon in the test-suite. See: https://clang.godbolt.org/z/WvnrcrGEn Differential Revision: https://reviews.llvm.org/D101508	2021-04-30 15:51:14 -07:00
Jon Roelofs	8be3af36f9	Revert "[EarlyIfConversion] Avoid producing selects with identical operands" This reverts commit `3d27b5d28a`. Broke one of the PPC tests, which I didn't see because I usually build with only the x86/AArch64 targets enabled... oops. https://lab.llvm.org/buildbot#builders/109/builds/13834 llvm/test/CodeGen/PowerPC/expand-foldable-isel.ll	2021-04-30 14:55:34 -07:00
Jon Roelofs	3d27b5d28a	[EarlyIfConversion] Avoid producing selects with identical operands This extends the early-ifcvt pass to avoid a few more cases where the resulting select instructions would have matching operands. Additionally, we now use TII to determine "sameness" of the operands so that as TII gets smarter, so too will ifcvt. The attached test case was bugpoint-reduced down from CINT2000/252.eon in the test-suite. See: https://clang.godbolt.org/z/WvnrcrGEn Differential Revision: https://reviews.llvm.org/D101508	2021-04-30 14:42:39 -07:00
Daniil Fukalov	3489c2d7b1	[TTI] NFC: Change getTypeLegalizationCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen, kparzysz Differential Revision: https://reviews.llvm.org/D101533	2021-04-30 22:51:51 +03:00
Tomas Matheson	c7df6b1223	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `3338290c18`. Broke expensive checks on debian.	2021-04-30 16:53:14 +01:00
Tomas Matheson	3338290c18	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164	2021-04-30 16:40:33 +01:00
Sidharth Baveja	70c433a184	[XCOFF][AIX] Add Global Variables Directly to TOC for 32 bit AIX Summary: This patch implements the backend implementation of adding global variables directly to the table of contents (TOC), rather than adding the address of the variable to the TOC. Currently, this patch will look for the "toc-data" attribute on symbols in the IR, and then add those symbols to the TOC. ATM, this is implemented for 32 bit AIX. Reviewers: sfertile Differential Revision: https://reviews.llvm.org/D101178	2021-04-30 14:48:02 +00:00
Simon Moll	43bc584dc0	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Matt Arsenault	55a29c6b71	VirtRegMap: Support partially allocated virtual registers Don't assert if there are unassigned virtual registers. Maintain LiveIntervals by removing the RegUnits for allocated registers, since they should no longer be necessary. One part I find somewhat questionable is the special handling necessary for handleIdentityCopy. The LiveIntervals for the relevant regunits need to be removed.	2021-04-29 21:51:05 -04:00
Matt Arsenault	1cf3d68f97	VirtRegMap: Add pass option to not clear virt regs In a future change it will be possible to run register allocation with a specific set of register classes, so some of the remaining virtual registers will still be meaningful.	2021-04-29 21:08:47 -04:00
Zequan Wu	cab48e2f0e	[CodeGen] don't emit addrsig symbol if it's used only by metadata Value only used by metadata can be removed from .addrsig table. This solves the undefined symbol error when enabling addrsig table on COFF LTO. Differential Revision: https://reviews.llvm.org/D101512	2021-04-29 15:39:30 -07:00
jasonliu	7049fbf960	[XCOFF] Handle the case when personality routine is an alias Summary: Personality routine could be an alias to another personality routine. Fix the situation when we compile the file that contains the personality routine and the file also has functions that need to refer to the personality routine. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D101401	2021-04-29 22:03:30 +00:00
Amara Emerson	fa2340574c	[GlobalISel][Legalizer] Bump up a smallvector size that was found to be too small. NFC.	2021-04-29 14:41:34 -07:00
Amara Emerson	96ec6d91e4	[AArch64][GlobalISel] Simplify out of range rotate amount. Differential Revision: https://reviews.llvm.org/D101005	2021-04-29 14:05:58 -07:00
Sriraman Tallam	a64411916c	Basic block sections for functions with implicit-section-name attribute Functions can have section names set via #pragma or section attributes, basic block sections should be correctly named for such functions. With #pragma, the expectation is that all functions in that file are placed in the same section in the final binary. Basic block sections should be correctly named with the unique flag set so that the final binary has all the basic blocks of the function in that named section. This patch fixes the bug by calling getExplicitSectionGlobal when implicit-section-name attribute is set to make sure the function's basic blocks get the correct section name. Differential Revision: https://reviews.llvm.org/D101311	2021-04-29 12:29:34 -07:00
Tim Northover	c1b7460b5b	Revert "RegAlloc: do not consider liveins to EH-pad successors as liveout." Some liveins can come from this block (e.g. any SSA value except the call), it's only the ones that produce `landingpad` values that can't and I didn't think it through properly.	2021-04-29 20:00:07 +01:00
Tim Northover	438a63e13b	RegAlloc: do not consider liveins to EH-pad successors as liveout. These registers get defined by the runtime, not the block being allocated, and treating them as preassigned in RegAllocFast adds extra pressure, sometimes enough to make the function unallocatable.	2021-04-29 19:34:49 +01:00
Benjamin Kramer	df323ba445	Revert "[X86] Support AMX fast register allocation" This reverts commit `3b8ec86fd5`. Revert "[X86] Refine AMX fast register allocation" This reverts commit `c3f95e9197`. This pass breaks using LLVM in a multi-threaded environment by introducing global state.	2021-04-29 18:56:33 +02:00
Craig Topper	0c330afdfa	[RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32. This replaces D98479. This allows type legalization to form SPLAT_VECTOR_PARTS so we don't lose the splattedness when the scalar type is split. I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so we can continue using non-VL nodes for scalable vectors. I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR with other operations. Especially interesting is a splat BUILD_VECTOR of the extract_vector_elt which can become a splat shuffle, but won't if we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR or add visitSPLAT_VECTOR. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D100803	2021-04-29 08:20:09 -07:00
Amara Emerson	2fa14d4700	Try to fix bots. We shouldn't be setting the entrybuilder's DL to a null one. This was causing a DILocation verifier error, the old code path didn't try to do this when building constants via the finishPendingPhis() method.	2021-04-29 03:51:10 -07:00
Amara Emerson	aa0b9200e8	[GlobalISel][IRTranslator] Move line zero DebugLoc creation to constant translation. NFC. This is a compile time optimization. DILocation::get() is expensive to call, and we were calling it to create a line zero debug loc for every instruction we translated. We only really need to do this just before we build constants in the entry block, so I moved this code there. This reduces the LLVM -O0 codegen time of sqlite3 IR by around 0.7% in instructions executed and by about 2% in CPU time. We can probably do better with a more involved change, since the reason we need to create one for each new constant is because we're using the debug scope and inlined-at loc. If we just use a single instruction's scope and drop the inlined-at, we can just cache these and have them be free.	2021-04-28 23:54:14 -07:00
Matt Arsenault	cea97fc0fc	GlobalISel: Relax verification of physical register copy types This was picking a concrete size for a physical register, and enforcing exact match on the virtual register's type size. Some targets add multiple types to a register class, and some are smaller than the full bit width. For example x86 adds f32 to 128-bit xmm registers, and AMDGPU adds i16/f16 to 32-bit registers. It might be better to represent these cases as a copy of the full register and an extraction of the subpart, but a lot of code assumes you can directly copy. This will help fix the current usage of the DAG calling convention infrastructure which is incompatible with how GlobalISel is now using it. The API is somewhat cumbersome here, but I just mirrored the existing functions, except now with LLTs (and allow returning null on failure, unlike the MVT version). I think the concept of selecting register classes based on type is flawed to begin with, but I'm trying to keep this compatible with the existing handling.	2021-04-28 08:45:41 -04:00
Benjamin Kramer	7e5682ee62	[ADT] Make TrackingStatistic's ctor constexpr This lets clang diagnose unused statistics, so remove them.	2021-04-28 12:00:17 +02:00
Stephen Tozer	b622df3c93	[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands This patch fixes a crash in LiveDebugVariables for inputs where a DBG_VALUE_LIST had 64 or more debug operands. This was triggering an assert, which was added under the assumption that only bad CodeGen would result in such a limit being hit, but relatively simple source files that result in these incredibly long debug values have been found, so this assert has been changed to a condition that drops the debug value if it is not met. Differential Revision: https://reviews.llvm.org/D101373	2021-04-28 10:39:02 +01:00
RamNalamothu	63cfab4f40	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-28 09:04:04 +05:30
Hongtao Yu	39ae5bf5c5	[CSSPGO] Fix an AV caused by a block that has only pseudo instructions. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D101415	2021-04-27 17:54:34 -07:00
Craig Topper	3067520bf4	[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT Previously we used an i32 constant to store the saturation width, but i32 isn't legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the type legalizer. This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes it opaque to the type legalizer. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101262	2021-04-27 14:38:42 -07:00
Nick Desaulniers	ea8416bf4d	[CodeGenOptions] make StackProtectorGuardOffset signed GCC supports negative values for -mstack-protector-guard-offset=, this should be a signed value. Pre-req to D100919. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101325	2021-04-27 10:12:58 -07:00
Petar Avramovic	0713c82b13	[GlobalISel]: Add a getConstantIntVRegVal utility Returns ConstantInt from G_CONSTANT instruction given its def register. Differential Revision: https://reviews.llvm.org/D99733	2021-04-27 10:52:07 +02:00
Chen Zheng	e5000eef81	[XCOFF] make .file directive have directory info The .file directive was changed to only have the basename in D36018 for ELF. But on AIX, we require the .file directive to also contain the directory info. This aligns with other AIX compilers like XLC and is required by some AIX tools like DBX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D99785	2021-04-27 00:15:23 -04:00
Fangrui Song	e01c666b13	Revert D76519 "[NFC] Refactor how CFI section types are represented in AsmPrinter" This reverts commit `0ce723cb22`. D76519 was not quite NFC. If we see a CFISection::Debug function before a CFISection::EH one (-fexceptions -fno-asynchronous-unwind-tables), we may incorrectly pick CFISection::Debug and emit a `.cfi_sections .debug_frame`. We should use .eh_frame instead. This scenario is untested.	2021-04-26 15:17:28 -07:00
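
Note on `6430430958` (sign rotated VBR): the commit message describes storing the sign bit first and negating the remaining bits, so that small negative numbers encode compactly. The following is a minimal Python sketch of that idea, not the TableGen emitter's code. The function names are hypothetical, and the byte layout (7 payload bits per byte, low-order group first, high bit as a continuation flag) is an assumption made for illustration.

```python
def sign_rotate(value: int) -> int:
    """Move the sign into bit 0 and keep the magnitude in the upper bits."""
    if value >= 0:
        return value << 1
    return ((-value) << 1) | 1


def sign_unrotate(encoded: int) -> int:
    """Inverse of sign_rotate."""
    magnitude = encoded >> 1
    return -magnitude if encoded & 1 else magnitude


def encode_vbr(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first.

    The high bit of each byte is set when more bytes follow (assumed layout).
    """
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_vbr(data: bytes) -> int:
    """Decode the byte sequence produced by encode_vbr."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return value


if __name__ == "__main__":
    # -1 as a raw 64-bit two's-complement pattern needs 10 bytes,
    # while the sign-rotated form of -1 is the value 3 and needs 1 byte.
    raw = encode_vbr(-1 & (2**64 - 1))
    rotated = encode_vbr(sign_rotate(-1))
    print(len(raw), len(rotated))
```

Under these assumptions, every value from -63 to 63 fits in a single byte, which is consistent with the table-size reductions the commit reports for small negative integers.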

30883 Commits