llvm-project

Commit Graph

Author	SHA1	Message	Date
Guozhi Wei	fd75ad8662	[MBFIWrapper] Add a new function getBlockProfileCount MBFIWrapper keeps track of block frequencies of newly created blocks and modified blocks, modified block frequencies should also impact block profile count. This class doesn't provide interface getBlockProfileCount, users can only use the underlying MBFI to query profile count, the underlying MBFI doesn't know the modifications made in MBFIWrapper, so it either provides stale profile count for modified block or simply crashes on new blocks. So this patch add function getBlockProfileCount to class MBFIWrapper to handle new blocks or modified blocks. Differential Revision: https://reviews.llvm.org/D87802	2020-09-23 09:31:45 -07:00
Matt Arsenault	c463fd136e	GlobalISel: Fix truncating shift amount in trunc (shl) combine The shift amount type does not necessarily match the result type. This was inserting a trunc from s32 to s32, which asserted. Just preserve the original shift amount type which can be legalized later.	2020-09-23 09:07:50 -04:00
David Sherwood	e077367a28	[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits An existing function Type::getScalarSizeInBits returns a uint64_t instead of a TypeSize class because the caller is requesting a scalar size, which cannot be scalable. This patch makes other similar functions requesting a scalar size consistent with that, thereby eliminating more than 1000 implicit TypeSize -> uint64_t casts. Differential revision: https://reviews.llvm.org/D87889	2020-09-23 09:20:08 +01:00
Fangrui Song	bee68b2956	[EHStreamer] Ensure CallSiteEntry::{BeginLabel,EndLabel} are non-null. NFC ... to simplify the code a bit. Reviewed By: rahmanl Differential Revision: https://reviews.llvm.org/D87999	2020-09-22 17:34:43 -07:00
Reid Kleckner	90242caca2	Revert "[CodeGen] emit CG profile for COFF object file" This reverts commit `91aed9bf97`, it is causing link errors.	2020-09-22 13:47:39 -07:00
Stefanos Baziotis	89c1e35f3c	[LoopInfo] empty() -> isInnermost(), add isOutermost() Differential Revision: https://reviews.llvm.org/D82895	2020-09-22 23:28:51 +03:00
Mircea Trofin	d1e0f9f3cf	[NFC][regalloc] Simplify/conform to style guide indvars in Greedy Differential Revision: https://reviews.llvm.org/D88055	2020-09-22 10:55:52 -07:00
Simon Pilgrim	4dada8d617	[DAG] Remove DAGTypeLegalizer::GenWidenVectorTruncStores (PR42046) Just scalarize trunc stores - GenWidenVectorTruncStores does the same thing but is flawed (PR42046) and unused. Differential Revision: https://reviews.llvm.org/D87708	2020-09-22 17:24:45 +01:00
Alexandre Ganea	723fea2307	Silence 'warning: unused variable' when compiling with Clang 10.0	2020-09-22 12:17:40 -04:00
Michael Liao	534f6e1718	[PeepholeOptimizer] Enhance the redundant COPY elimination. - Eliminate redundant COPYs from the same register & subregister pair. Differential Revision: https://reviews.llvm.org/D87939	2020-09-22 10:11:37 -04:00
Muhammad Omair Javaid	73a6a164b8	Revert "Reapply Revert "RegAllocFast: Rewrite and improve"" This reverts commit `55f9f87da2`. Breaks following buildbots: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4306 http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/9154	2020-09-22 14:40:06 +05:00
Mircea Trofin	6a6b06f526	[NFC][regalloc] Use reverse iterator ranges for improved readability Differential Revision: https://reviews.llvm.org/D88047	2020-09-21 14:58:37 -07:00
Martin Storsjö	36c64af9d7	[CodeGen] [WinException] Only produce handler data at the end of the function if needed If we are going to write handler data (that is written as variable length data following after the unwind info in .xdata), we need to emit the handler data immediately, but for cases where no such info is going to be written, skip emitting it right away. (Unwind info for all remaining functions that hasn't gotten it emitted directly is emitted at the end.) This does slightly change the ordering of sections (triggering a bunch of updates to DebugInfo/COFF tests), but the change should be benign. This also matches GCC's assembly output, which doesn't output .seh_handlerdata unless it actually is needed. For ARM64, the unwind info can be packed into the runtime function entry itself (leaving no data in the .xdata section at all), but that can only be done if there's no follow-on data in the .xdata section. If emission of the unwind info is triggered via EmitWinEHHandlerData (or the .seh_handlerdata directive), which implicitly switches to the .xdata section, there's a chance of the caller wanting to pass further data there, so the packed format can't be used in that case. Differential Revision: https://reviews.llvm.org/D87448	2020-09-21 23:42:59 +03:00
Matt Arsenault	55f9f87da2	Reapply Revert "RegAllocFast: Rewrite and improve" This reverts commit `dbd53a1f0c`. Needed lldb test updates	2020-09-21 15:45:27 -04:00
Simon Pilgrim	6a0ed57a22	ImplicitNullChecks.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI.	2020-09-21 17:42:57 +01:00
Simon Pilgrim	3ae07b2a33	TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 17:17:11 +01:00
Simon Pilgrim	ce294ff8cd	MachineCSE.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 16:54:26 +01:00
Denis Antrushin	ee86688b81	[Statepoints][ISEL] gc.relocate uniquification should be based on SDValue, not IR Value. When exporting statepoint results to virtual registers we try to avoid generating exports for duplicated inputs. But we erroneously use IR Value* to check if inputs are duplicated. Instead, we should use SDValue, because even different IR values can get lowered to the same SDValue. I'm adding a (degenerate) test case which emphasizes importance of this feature for invoke statepoints. If we fail to export only unique values we will end up with something like that: %0 = STATEPOINT %1 = COPY %0 landing_pad: <use of %1> And when exceptional path is taken, %1 is left uninitialized (COPY is never execute). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87695	2020-09-21 19:44:46 +07:00
Alexander Belyaev	17dc729bd4	Revert "[NFC][ScheduleDAG] Remove unused EntrySU SUnit" This reverts commit `0345d88de6`. Google internal backend uses EntrySU, we are looking into removing dependency on it. Differential Revision: https://reviews.llvm.org/D88018	2020-09-21 13:33:05 +02:00
Lucas Prates	53d238a961	[CodeGen] Fixing inconsistent ABI mangling of vlaues in SelectionDAGBuilder SelectionDAGBuilder was inconsistently mangling values based on ABI Calling Conventions when getting them through copyFromRegs in SelectionDAGBuilder, causing duplicate value type convertions for function arguments. The checking for the mangling requirement was based on the value's originating instruction and was performed outside of, and inspite of, the regular Calling Convention Lowering. The issue could be observed in a scenario such as: ``` %arg1 = load half, half* %const, align 2 %arg2 = call fastcc half @someFunc() call fastcc void @otherFunc(half %arg1, half %arg2) ; Here, %arg2 was incorrectly mangled twice, as the CallConv data from ; the call to @someFunc() was taken into consideration for the check ; when getting the value for processing the call to @otherFunc(...), ; after the proper convertion had taken place when lowering the return ; value of the first call. ``` This patch fixes the issue by disregarding the Calling Convention information for such copyFromRegs, making sure the ABI mangling is properly contanined in the Calling Convention Lowering. This fixes Bugzilla #47454. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87844	2020-09-21 10:05:34 +01:00
Fangrui Song	dbc616e982	[EHStreamer] Fix a "Continue to action" -fverbose-asm comment when multi-byte LEB128 encoding is needed This only happens with more than 64 action records and it is difficult to construct a test.	2020-09-20 21:41:48 -07:00
Qiu Chaofan	1d782c2987	[PowerPC] Pass nofpexcept flag to custom lowered constrained ops This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down. This is also seen on X86 target. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87390	2020-09-21 10:44:25 +08:00
Fangrui Song	d06485685d	[XRay] Change mips to use version 2 sled (PC-relative address) Follow-up to D78590. All targets use PC-relative addresses now. Reviewed By: atanasyan, dberris Differential Revision: https://reviews.llvm.org/D87977	2020-09-20 17:59:57 -07:00
Fangrui Song	6913812abc	Fix some clang-tidy bugprone-argument-comment issues	2020-09-19 20:41:25 -07:00
Eric Christopher	dbd53a1f0c	Temporarily Revert "RegAllocFast: Rewrite and improve" as it's breaking a few tests in the lldb test suite. Bot: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4226/steps/test/logs/stdio This reverts commit `c8757ff3aa`.	2020-09-18 18:11:21 -07:00
Fangrui Song	2ac06241d2	[LiveDebugValues] Add `#if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP)` to suppress -Wunused-function	2020-09-18 17:25:37 -07:00
Amara Emerson	5d34d7f1a0	[GlobalISel] Add lowering support for G_ABS and use for AArch64. Differential Revision: https://reviews.llvm.org/D87952	2020-09-18 16:17:18 -07:00
Reid Kleckner	9932561b48	[COFF] Move per-global .drective emission from AsmPrinter to TLOFCOFF This changes the order of output sections and the output assembly, but is otherwise NFC. It simplifies the TLOF interface by removing two COFF-only methods.	2020-09-18 14:31:01 -07:00
James Y Knight	f7a53d82c0	PR47468: Fix findPHICopyInsertPoint, so that copies aren't incorrectly inserted after an INLINEASM_BR. findPHICopyInsertPoint special cases placement in a block with a callbr or invoke in it. In that case, we must ensure that the copy is placed before the INLINEASM_BR or call instruction, if the register is defined prior to that instruction, because it may jump out of the block. Previously, the code placed it immediately after the last def _or use_. This is wrong, if the use is the instruction which may jump. We could correctly place it immediately after the last def (ignoring uses), but that is non-optimal for register pressure. Instead, place the copy after the last def, or before the call/inlineasm_br, whichever is later. Differential Revision: https://reviews.llvm.org/D87865	2020-09-18 14:14:04 -04:00
Matt Arsenault	3105d0f84b	CodeGen: Move split block utility to MachineBasicBlock AMDGPU needs this in several places, so consolidate them here.	2020-09-18 14:05:18 -04:00
Matt Arsenault	c8757ff3aa	RegAllocFast: Rewrite and improve This rewrites big parts of the fast register allocator. The basic strategy of doing block-local allocation hasn't changed but I tweaked several details: Track register state on register units instead of physical registers. This simplifies and speeds up handling of register aliases. Process basic blocks in reverse order: Definitions are known to end register livetimes when walking backwards (contrary when walking forward then uses may or may not be a kill so we need heuristics). Check register mask operands (calls) instead of conservatively assuming everything is clobbered. Enhance heuristics to detect killing uses: In case of a small number of defs/uses check if they are all in the same basic block and if so the last one is a killing use. Enhance heuristic for copy-coalescing through hinting: We check the first k defs of a register for COPYs rather than relying on there just being a single definition. When testing this on the full llvm test-suite including SPEC externals I measured: average 5.1% reduction in code size for X86, 4.9% reduction in code on aarch64. (ranging between 0% and 20% depending on the test) 0.5% faster compiletime (some analysis suggests the pass is slightly slower than before, but we more than make up for it because later passes are faster with the reduced instruction count) Also adds a few testcases that were broken without this patch, in particular bug 47278. Patch mostly by Matthias Braun	2020-09-18 14:05:18 -04:00
Matt Arsenault	870fd53e4f	Reapply "RegAllocFast: Record internal state based on register units" The regressions this caused should be fixed when https://reviews.llvm.org/D52010 is applied. This reverts commit `a21387c654`.	2020-09-18 14:05:18 -04:00
Zequan Wu	91aed9bf97	[CodeGen] emit CG profile for COFF object file I forgot to add emission of CG profile for COFF object file, when adding the support (https://reviews.llvm.org/D81775) Differential Revision: https://reviews.llvm.org/D87811	2020-09-18 10:57:54 -07:00
Francis Visoiu Mistrih	0345d88de6	[NFC][ScheduleDAG] Remove unused EntrySU SUnit EntrySU doesn't seem to be used at all when building the ScheduleDAG. Differential Revision: https://reviews.llvm.org/D87867	2020-09-18 09:50:47 -07:00
Simon Pilgrim	d967aaa8fa	[DAG] BuildVectorSDNode::getSplatValue - pull out repeated getNumOperands() calls. NFCI.	2020-09-18 16:10:23 +01:00
Matt Arsenault	751a6c5760	IR: Move denormal mode parsing from MachineFunction to Function This was just inspecting the IR to begin with, and is useful to check in some places in the IR.	2020-09-18 09:55:47 -04:00
Philip Reames	b04c181ed7	[AArch64] Enable implicit null check transformation This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support: An implicit null check is the use of a signal handler to catch and redirect to a handler a null pointer. Specifically, it's replacing an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control w/appropriate metadata. FAULTING_OP is used to wrap the faulting instruction. It is modelled as being a conditional branch to reflect the fact it can transfer control in the CFG. FAULTING_OP does not need to be an analyzable branch to achieve it's purpose. (Or at least, that's the x86 model. I find this slightly questionable.) When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction. As can be seen in the test changes, currently the AArch64 backend does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causing BranchFolding to crash. I haven't yet figured out if it's a latent bug in BranchFolding, or something I'm doing wrong.) Differential Revision: https://reviews.llvm.org/D87851	2020-09-17 16:00:19 -07:00
Quentin Colombet	99e865b618	[TargetRegisterInfo] Add a couple of target hooks for the greedy register allocator Before this patch, the last chance recoloring and deferred spilling techniques were solely controled by command line options. This patch adds target hooks for these two techniques so that it is easier for backend writers to override the default behavior. The default behavior of the hooks preserves the default values of the related command line options. NFC	2020-09-17 15:23:15 -07:00
Derek Schuff	0ff28fa6a7	Support dwarf fission for wasm object files Initial support for dwarf fission sections (-gsplit-dwarf) on wasm. The most interesting change is support for writing 2 files (.o and .dwo) in the wasm object writer. My approach moves object-writing logic into its own function and calls it twice, swapping out the endian::Writer (W) in between calls. It also splits the import-preparation step into its own function (and skips it when writing a dwo). Differential Revision: https://reviews.llvm.org/D85685	2020-09-17 14:42:41 -07:00
Victor Huang	a4bb71b1c0	Disable hoisting MI to hotter basic blocks when using pgo This is a follow up patch for https://reviews.llvm.org/D63676 to enable the feature when using pgo. Differential Revision: https://reviews.llvm.org/D85240	2020-09-17 14:17:00 -05:00
Amara Emerson	79b21fc187	[AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors. For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break down the source vector into the final source vectors of <2 x s64> using unmerge. This fixes a crash due to using the wrong number of elements for the breakdown type. Also add some legalizer tests for explicitly G_FPTRUNC which we didn't have. Differential Revision: https://reviews.llvm.org/D87814	2020-09-17 08:56:26 -07:00
Simon Pilgrim	2a56a0ba08	ModuloSchedule.cpp - remove unnecessary includes. NFCI. Already included in ModuloSchedule.h	2020-09-17 16:47:48 +01:00
Simon Pilgrim	85ba2f1663	LiveDebugVariables.cpp - remove unnecessary Compiler.h include. NFCI. Already included in LiveDebugVariables.h	2020-09-17 15:06:02 +01:00
Simon Pilgrim	46e59062a0	DwarfExpression.cpp - remove unnecessary includes. NFCI. Already included in DwarfExpression.h	2020-09-17 15:06:02 +01:00
Simon Pilgrim	67ae46c820	SafeStackLayout.cpp - remove unnecessary StackLifetime.h include. NFCI. Already included in SafeStackLayout.h	2020-09-17 14:56:46 +01:00
Simon Pilgrim	572e542c5e	DwarfStringPool.cpp - remove unnecessary StringRef include. NFCI. Already included in DwarfStringPool.h	2020-09-17 12:18:27 +01:00
Simon Pilgrim	71f237506b	DwarfFile.h - remove unnecessary includes. NFCI. Use forward declarations where possible, move includes down to DwarfFile.cpp and avoid duplicate includes.	2020-09-17 12:12:18 +01:00
Simon Pilgrim	550b1a6fd4	[AsmPrinter] DwarfDebug - use DebugLoc const references where possible. NFC. Avoid unnecessary copies.	2020-09-17 10:45:54 +01:00
Simon Pilgrim	4ae1bb193a	[AsmPrinter] Remove orphan DwarfUnit::shareAcrossDWOCUs declaration. NFCI. Method implementation no longer exists.	2020-09-17 10:45:52 +01:00
Jay Foad	6f6d389da5	[SplitKit] Only copy live lanes When splitting a live interval with subranges, only insert copies for the lanes that are live at the point of the split. This avoids some unnecessary copies and fixes a problem where copying dead lanes was generating MIR that failed verification. The test case for this is test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir. Without this fix, some earlier live range splitting would create %430: %430 [256r,848r:0)[848r,2584r:1) 0@256r 1@848r L0000000000000003 [848r,2584r:0) 0@848r L0000000000000030 [256r,2584r:0) 0@256r weight:1.480938e-03 ... 256B undef %430.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec ... 848B %430.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec ... 2584B %431:vreg_128 = COPY %430:vreg_128 Then RAGreedy::tryLocalSplit would split %430 into %432 and %433 just before 848B giving: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03 %433 [844r,848r:0)[848r,2584r:1) 0@844r 1@848r L0000000000000030 [844r,2584r:0) 0@844r L0000000000000003 [844r,844d:0)[848r,2584r:1) 0@844r 1@848r weight:2.831776e-03 ... 256B undef %432.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec ... 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 { internal %433.sub2:vreg_128 = COPY %432.sub2:vreg_128 848B } %433.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec ... 2584B %431:vreg_128 = COPY %433:vreg_128 Note that the copy from %432 to %433 at 844B is a curious bundle-without-a-BUNDLE-instruction that SplitKit creates deliberately, and it includes a copy of .sub0 which is not live at this point, and that causes it to fail verification: * Bad machine code: No live subrange at use * - function: zextload_global_v64i16_to_v64i64 - basic block: %bb.0 (0x7faed48) [0B;2848B) - instruction: 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 - operand 1: %432.sub0:vreg_128 - interval: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03 - at: 844B Using real bundles with a BUNDLE instruction might also fix this problem, but the current fix is less invasive and also avoids some unnecessary copies. https://bugs.llvm.org/show_bug.cgi?id=47492 Differential Revision: https://reviews.llvm.org/D87757	2020-09-17 09:26:11 +01:00

1 2 3 4 5 ...

29468 Commits