llvm-project

Commit Graph

Author	SHA1	Message	Date
Qiu Chaofan	a2fb5446be	[SelectionDAG] Check any use of negation result before removal `2508ef01` fixed a bug in constant removal during negation, but a sanitizer check showed a remaining issue, so it was reverted. Temporary nodes are removed if they turn out to be useless in negation. Before removal, they are checked for uses by other nodes, so the removal was moved after getNode. However, in rare cases the node to be removed is the same as the result of getNode. That case was missed and is fixed by this patch. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87614	2020-09-17 16:00:54 +08:00
Igor Kudrin	027d47d1c7	[DebugInfo] Simplify DIEInteger::SizeOf(). An AsmPrinter should always be provided to the method because some forms depend on its parameters. The only place in the codebase which passed a nullptr value was found in the unit tests, so the patch updates it to use some dummy AsmPrinter instead. Differential Revision: https://reviews.llvm.org/D85293	2020-09-17 12:47:38 +07:00
Craig Topper	e30371d99d	[DAGCombiner] Teach visitMSTORE to replace an all ones mask with an unmasked store. Similar to what was done in D87788 for MLOAD. Again I've skipped indexed, truncating, and compressing stores.	2020-09-16 16:42:22 -07:00
Craig Topper	89ee4c0314	[DAGCombiner] Teach visitMLOAD to replace an all ones mask with an unmasked load If we have an all ones mask, we can just use a regular unmasked load. InstCombine already gets this in IR. But the all ones mask can appear after type legalization. Only avx512 test cases are affected because the X86 backend already looks for element 0 and the last element being 1. It replaces this with an unmasked load and blend. The all ones mask is a special case of that where the blend will be removed. That transform is only enabled on avx2 targets. I believe that's because a non-zero passthru on avx2 already requires a separate blend so it's more profitable to handle mixed constant masks. This patch adds a dedicated all ones handling to the target independent DAG combiner. I've skipped extending, expanding, and index loads for now. X86 doesn't use index so I don't know much about it. Extending made me nervous because I wasn't sure I could trust the memory VT had the right element count due to some weirdness in vector splitting. For expanding I wasn't sure if we needed different undef handling. Differential Revision: https://reviews.llvm.org/D87788	2020-09-16 13:21:16 -07:00
Matt Arsenault	88bdcbbf1a	GlobalISel: Lift store value widening restriction This doesn't change the memory size and doesn't need to worry about non-power-of-2 sizes.	2020-09-16 14:25:07 -04:00
Michael Kitzan	c4e589b795	[GISel] Add new combines for unary FP instrs with constant operand https://reviews.llvm.org/D86393 Patch adds five new `GICombinerRules`, one for each of the following unary FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The combine rules perform the FP operation on the constant operand and replace the original instr with the result. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules.	2020-09-16 10:34:15 -07:00
Simon Pilgrim	8f7d6b2375	DwarfUnit.h - remove unnecessary includes. NFCI.	2020-09-16 18:32:29 +01:00
Simon Pilgrim	69682f993c	InterferenceCache.cpp - remove duplicate includes. NFCI. Remove headers already included in InterferenceCache.h	2020-09-16 18:32:28 +01:00
Matt Arsenault	738c73a454	RegAllocFast: Make self loop live-out heuristic more aggressive This currently has no impact on code, but prevents sizeable code size regressions after D52010. This prevents spilling and reloading all values inside blocks that loop back. Add a baseline test which would regress without this patch.	2020-09-16 13:12:38 -04:00
Matt Arsenault	8d8a496356	LocalStackSlotAllocation: Swap order of check	2020-09-16 12:56:40 -04:00
Francesco Petrogalli	15e9a6c211	[llvm][CodeGen] Do not scalarize `llvm.masked.[gather\|scatter]` operating on scalable vectors. This patch prevents the `llvm.masked.gather` and `llvm.masked.scatter` intrinsics from being scalarized when invoked on scalable vectors. The change in `Function.cpp` is needed to prevent the warning that is raised when `getNumElements` is used in place of `getElementCount` on `VectorType` instances. The tests guard against regressions on this change. The tests make sure that calls to `llvm.masked.[gather\|scatter]` are still scalarized when: (1) the intrinsics are operating on fixed size vectors, and (2) the compiler is not targeting fixed length SVE code generation. Reviewed By: efriedma, sdesmalen Differential Revision: https://reviews.llvm.org/D86249	2020-09-16 16:00:28 +00:00
Mircea Trofin	6e85c3d5c7	[NFC][Regalloc] accessors for 'reg' and 'weight' Also renamed the fields to follow style guidelines. Accessors help with readability - weight mutation, in particular, is easier to follow this way. Differential Revision: https://reviews.llvm.org/D87725	2020-09-16 08:28:57 -07:00
Sebastian Neubauer	833b3b0d3a	[AMDGPU] Add v3f16/v3i16 support to SDag Fix lowering and instruction selection for v3x16 types and enable InstCombine to emit them. This patch only implements it for the selection dag. GlobalISel tests in GlobalISel/llvm.amdgcn.image.load.1d.d16.ll and GlobalISel/llvm.amdgcn.image.store.2d.d16.ll still don't work. Differential Revision: https://reviews.llvm.org/D84420	2020-09-16 17:20:27 +02:00
Sam Parker	1c421046d7	[RDA] Fix getUniqueReachingDef for self loops We've fixed the case where this could return an instruction after the given instruction, but this also means that we can falsely return a 'unique' def when there could be one coming from the backedge of a loop. Differential Revision: https://reviews.llvm.org/D87751	2020-09-16 12:44:23 +01:00
Simon Pilgrim	3f682611ab	[DAG] Remove getOperand() call. NFCI.	2020-09-16 11:18:58 +01:00
Volkan Keles	79378b1b75	GlobalISel: Fix a failing combiner test test/CodeGen/AArch64/GlobalISel/combine-trunc.mir was failing due to the different order for evaluating function arguments. This patch updates the related code to fix the issue.	2020-09-15 16:40:38 -07:00
Aditya Nandakumar	97203cfd6b	[GISel] Add new GISel combiners for G_MUL https://reviews.llvm.org/D87668 Patch adds two new GICombinerRules, one for G_MUL(X, 1) and another for G_MUL(X, -1). G_MUL(X, 1) is an identity combine, and G_MUL(X, -1) gets replaced with G_SUB(0, X). Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules, as well as updates AMDGPU GISel tests. Patch by mkitzan	2020-09-15 16:08:47 -07:00
Volkan Keles	a4e35cc2ec	GlobalISel: Add combines for G_TRUNC https://reviews.llvm.org/D87050	2020-09-15 15:50:34 -07:00
Guozhi Wei	243ffd0cad	[MachineBasicBlock] Fix a typo in function copySuccessor The condition used to decide whether to copy the probability should be reversed. Differential Revision: https://reviews.llvm.org/D87417	2020-09-15 09:18:18 -07:00
Qiu Chaofan	e1669843f2	Revert "[SelectionDAG] Remove unused FP constant in getNegatedExpression" `2508ef01` doesn't totally fix the issue since we did not handle the case where the unused temporary negated result is the same as the result, which was found by address sanitizer.	2020-09-15 22:03:50 +08:00
Hans Wennborg	a21387c654	Revert "RegAllocFast: Record internal state based on register units" This seems to have caused incorrect register allocation in some cases, breaking tests in the Zig standard library (PR47278). As discussed on the bug, revert back to green for now. > Record internal state based on register units. This is often more > efficient as there are typically fewer register units to update > compared to iterating over all the aliases of a register. > > Original patch by Matthias Braun, but I've been rebasing and fixing it > for almost 2 years and fixed a few bugs causing intermediate failures > to make this patch independent of the changes in > https://reviews.llvm.org/D52010. This reverts commit `66251f7e1d`, and follow-ups `931a68f26b` and `0671a4c508`. It also adjusts some test expectations.	2020-09-15 13:25:41 +02:00
Simon Pilgrim	6c1f2a34fb	SpillPlacement.cpp - remove unnecessary includes. NFCI. These are all directly included in SpillPlacement.h	2020-09-15 12:18:24 +01:00
Simon Pilgrim	1abb4461ea	StatepointLowering.cpp - remove unnecessary includes. NFCI. These are all directly included in StatepointLowering.h	2020-09-15 12:18:23 +01:00
Simon Pilgrim	bee79cdcc6	SelectionDAGBuilder.h - remove unnecessary includes. NFCI. Reduce to forward declarations and move implicit dependencies down to the cpp files.	2020-09-15 12:18:22 +01:00
Qiu Chaofan	2508ef014e	[SelectionDAG] Remove unused FP constant in getNegatedExpression `960cbc53` immediately removes nodes that won't be used to avoid compilation time explosion. This patch adds the removal to constants to fix PR47517. Reviewed By: RKSimon, steven.zhang Differential Revision: https://reviews.llvm.org/D87614	2020-09-15 17:59:10 +08:00
Petar Avramovic	9b4fa85434	GlobalISel/IRTranslator resetTargetOptions based on function attributes Update TargetMachine.Options with function attributes before we start to generate MIR instructions. This allows access to correct function attributes via TargetMachine.Options (it used to access attributes of the function that was translated first). This affects some existing tests with "no-nans-fp-math" attribute. Follow-up on D87456. Differential Revision: https://reviews.llvm.org/D87511	2020-09-15 10:26:09 +02:00
Igor Kudrin	a845ebd633	[DebugInfo] Make offsets of dwarf units 64-bit (19/19). In the case of LTO, several DWARF units can be emitted in one section. For an extremely large application, they may exceed the limit of 4GiB for 32-bit offsets. As it is now possible to emit 64-bit debugging info, the patch enables storing the larger offsets. Differential Revision: https://reviews.llvm.org/D87026	2020-09-15 12:23:32 +07:00
Igor Kudrin	8c19ac23bd	[DebugInfo] Make the offset of string pool entries 64-bit (18/19). The string pool is shared among several units in the case of LTO, and it potentially can exceed the limit of 4GiB for an extremely large application. As it is now possible to emit 64-bit debugging info, the limitation can be removed. Differential Revision: https://reviews.llvm.org/D87025	2020-09-15 12:23:32 +07:00
Igor Kudrin	7e1e4e81cb	[DebugInfo] Fix emitting DWARF64 .debug_macro[.dwo] sections (17/19). The patch fixes emitting flags and the debug_line_offset field in the header, as well as the reference to the macro string for a pre-standard GNU .debug_macro extension. Differential Revision: https://reviews.llvm.org/D87024	2020-09-15 12:23:31 +07:00
Igor Kudrin	a93dd26d8c	[DebugInfo] Fix emitting DWARF64 .debug_names sections (16/19). The patch fixes emitting the unit length field in the header of the table and offsets to the entry pool. Note that while the patch changes the common method to emit offsets, in fact, nothing is changed for Apple accelerator tables, because we do not yet support DWARF64 for those targets. Differential Revision: https://reviews.llvm.org/D87023	2020-09-15 12:23:31 +07:00
Igor Kudrin	00ce54689d	[DebugInfo] Fix emitting DWARF64 .debug_addr sections (15/19). The patch fixes emitting the header of the table. The content is independent of the DWARF format. Differential Revision: https://reviews.llvm.org/D87022	2020-09-15 12:23:31 +07:00
Igor Kudrin	3158d3dd4b	[DebugInfo] Fix emitting DWARF64 .debug_loclists sections (14/19). The size of the offsets in the table depends on the DWARF format. Differential Revision: https://reviews.llvm.org/D87020	2020-09-15 12:23:31 +07:00
Igor Kudrin	f9b242fe24	[DebugInfo] Fix emitting DWARF64 .debug_rnglists sections (13/19). The size of the offsets in the table depends on the DWARF format. Differential Revision: https://reviews.llvm.org/D87019	2020-09-15 12:23:31 +07:00
Igor Kudrin	03b09c6b68	[DebugInfo] Fix emitting pre-v5 name lookup tables in the DWARF64 format (12/19). The transition is done by using methods of AsmPrinter which automatically emit values in compliance with the selected DWARF format. Differential Revision: https://reviews.llvm.org/D87013	2020-09-15 12:23:30 +07:00
Igor Kudrin	b118030f3f	[DebugInfo] Fix emitting DWARF64 .debug_aranges sections (11/19). The patch fixes calculating the size of the table and emitting the fields which depend on the DWARF format by using methods that choose appropriate sizes automatically. Differential Revision: https://reviews.llvm.org/D87012	2020-09-15 12:23:30 +07:00
Igor Kudrin	18f23b3ecc	[DebugInfo] Fix emitting DWARF64 type units (10/19). The patch fixes emitting the offset to the type DIE. All other fields are already fixed in previous patches. Differential Revision: https://reviews.llvm.org/D87021	2020-09-15 11:31:07 +07:00
Igor Kudrin	924dc58076	[DebugInfo] Fix emitting DWARF64 DWO compilation units and string offset tables (9/19). These two fixes are better to go together because llvm-dwarfdump is unable to dump a table when another one is malformed. Differential Revision: https://reviews.llvm.org/D87018	2020-09-15 11:31:00 +07:00
Igor Kudrin	383d34c077	[DebugInfo] Fix emitting DWARF64 .debug_str_offsets sections (8/19). The patch fixes calculating the size of the table and emitting the unit length field. Differential Revision: https://reviews.llvm.org/D87017	2020-09-15 11:30:53 +07:00
Igor Kudrin	26f1f18831	[DebugInfo] Fix emitting the DW_AT_location attribute for 64-bit DWARFv3 (7/19). The patch uses a common method to determine the appropriate form for the value of the attribute. Differential Revision: https://reviews.llvm.org/D87016	2020-09-15 11:30:46 +07:00
Igor Kudrin	cae7c1eb78	[DebugInfo] Use a common method to determine a suitable form for section offsets (6/19). This is mostly an NFC patch because the involved methods are used when emitting DWO files, which is incompatible with DWARFv3, or for platforms where DWARF64 is not supported yet. Differential Revision: https://reviews.llvm.org/D87015	2020-09-15 11:30:38 +07:00
Igor Kudrin	5dd1c59188	[DebugInfo] Fix emitting DWARF64 compilation units (5/19). The patch also adds a method to choose an appropriate DWARF form to represent section offsets according to the version and the format of producing debug info. Differential Revision: https://reviews.llvm.org/D87014	2020-09-15 11:30:30 +07:00
Igor Kudrin	982b31fad2	[DebugInfo] Add the -dwarf64 switch to llc and other internal tools (4/19). The patch adds a switch to enable emitting debug info in the 64-bit DWARF format. Most emitters for sections will be updated in the subsequent patches, whereas for .debug_line and .debug_frame the emitters are in the MC library, which is already updated. For now, the switch is enabled only for 64-bit ELF targets. Differential Revision: https://reviews.llvm.org/D87011	2020-09-15 11:30:18 +07:00
Igor Kudrin	c3c501f5d7	[DebugInfo] Add new emitting methods for values which depend on the DWARF format (3/19). These methods are going to be used in subsequent patches. Differential Revision: https://reviews.llvm.org/D87010	2020-09-15 11:30:10 +07:00
Igor Kudrin	a8058c6f8d	[DebugInfo] Fix DIE value emitters to be compatible with DWARF64 (2/19). DW_FORM_sec_offset and DW_FORM_strp imply values of different sizes with DWARF32 and DWARF64. The patch fixes DIE value classes to use correct sizes when emitting their values. For DIELocList it ensures that the requested DWARF form matches the current DWARF format because that class uses a method that selects the size automatically. Differential Revision: https://reviews.llvm.org/D87009	2020-09-15 11:30:02 +07:00
Igor Kudrin	380e746bcc	[DebugInfo] Fix methods of AsmPrinter to emit values corresponding to the DWARF format (1/19). These methods are used to emit values which are 32-bit in DWARF32 and 64-bit in DWARF64. The patch fixes them so that they choose the length automatically, depending on the DWARF format set in the Context. Differential Revision: https://reviews.llvm.org/D87008	2020-09-15 11:29:48 +07:00
Quentin Colombet	b3afad0463	[GlobalISel] Add a `X, Y = G_UNMERGE(G_ZEXT Z)` -> X = G_ZEXT Z; Y = 0 combine Add a combiner helper to transform unmerge of zext into one zext and a constant 0 Differential Revision: https://reviews.llvm.org/D87427	2020-09-14 17:27:23 -07:00
Quentin Colombet	d2321129bd	[GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z Add a combiner helper that replaces G_UNMERGE where all the destination lanes are dead except the first one with a G_TRUNC. Differential Revision: https://reviews.llvm.org/D87174	2020-09-14 17:27:23 -07:00
Quentin Colombet	a36278c2f8	[GlobalISel] Add G_UNMERGE(Cst) -> Cst1, Cst2, ... combine Add a combiner helper that replaces G_UNMERGE of big constants into direct use of smaller constants. Differential Revision: https://reviews.llvm.org/D87166	2020-09-14 16:30:18 -07:00
Aditya Nandakumar	46f9137e43	[GISel]: Add combine for G_FABS to G_FABS https://reviews.llvm.org/D87554 Patch adds one new GICombinerRule for G_FABS. The combine rule folds G_FABS(G_FABS(X)) to G_FABS(X). Patch additionally adds new combiner tests for the AArch64 target to test this new combiner rule. Patch by mkitzan.	2020-09-14 15:56:24 -07:00
Quentin Colombet	670c276232	[GlobalISel] Add G_UNMERGE_VALUES(G_MERGE_VALUES) combine Add the matching and applying function to the combiner helper for G_UNMERGE_VALUES(G_MERGE_VALUES). This combine also supports any merge-like input nodes, like G_BUILD_VECTORS and is robust against bitcasts in between int unmerge and merge nodes. When the input type of the merge node and the output type of the unmerge node are not the same, but the sizes are, the combine still applies but creates bitcasts between the sources and the destinations instead of reusing the destinations directly. Long term, the artifact combiner should probably reuse that helper, but as of today, it doesn't use any outside helper, so I kept it this way. Differential Revision: https://reviews.llvm.org/D87117	2020-09-14 15:45:06 -07:00
Craig Topper	c193a689b4	[SelectionDAG] Use Align/MaybeAlign in calls to getLoad/getStore/getExtLoad/getTruncStore. The versions that take 'unsigned' will be removed in the future. I tried to use getOriginalAlign instead of getAlign in some places. getAlign factors in the minimum alignment implied by the offset in the pointer info. Since we're also passing the pointer info we can use the original alignment. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87592	2020-09-14 13:54:50 -07:00
Craig Topper	4208ea3e19	[FastISel] Bail out of selectGetElementPtr for vector GEPs. The code that decomposes the GEP into ADD/MUL doesn't work properly for vector GEPs. It can create bad COPY instructions or possibly assert. For now just bail out to SelectionDAG. Fixes PR45906	2020-09-14 12:53:06 -07:00
Nikita Popov	53f36f06af	[Legalize][ARM][X86] Add float legalization for VECREDUCE This adds SoftenFloatRes, PromoteFloatRes and SoftPromoteHalfRes legalizations for VECREDUCE, to fill the remaining hole in the SDAG legalization. These legalizations simply expand the reduction and let it be recursively legalized. For the PromoteFloatRes case at least it is possible to do better than that, but it's pretty tricky (because we need to consider the interaction of three different vector legalizations and the type promotion) and probably not really worthwhile. I haven't added ExpandFloatRes support, as I am not familiar with ppc_fp128. Differential Revision: https://reviews.llvm.org/D87569	2020-09-14 20:42:09 +02:00
Nikita Popov	8e69c3cde8	[DAGCombiner] Fold fmin/fmax with INF / FLT_MAX Similar to D87415, this folds the various float min/max opcodes with a constant INF or -INF operand, or FLT_MAX / -FLT_MAX operand if the ninf flag is set. Some of the folds are only possible under nnan. The fminnum(X, INF) with nnan and fmaxnum(X, -INF) with nnan cases are needed to improve the VECREDUCE_FMIN/FMAX lowerings on X86, the rest is here for the sake of completeness. Differential Revision: https://reviews.llvm.org/D87571	2020-09-14 19:59:33 +02:00
Rahman Lavaee	7841e21c98	Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols. This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section. The format of the emitted data is represented as follows. It includes a header for every function: \| Address of the function \| -> 8 bytes (pointer size) \| Number of basic blocks in this function (>0) \| -> ULEB128 The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows: \| Offset of the basic block relative to function begin \| -> ULEB128 \| Binary size of the basic block \| -> ULEB128 \| BB metadata \| -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ] The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels. The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers. Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%. For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html Reviewed By: MaskRay, snehasish Differential Revision: https://reviews.llvm.org/D85408	2020-09-14 10:16:44 -07:00
David Green	06fb4e9064	[CGP] Limit converting phi types to simple loads and stores Instcombine limits converting phi types to simple loads and stores. This does the same in codegenprepare, not processing phis that are not simple. Note that volatile loads/store ISel will happily convert between float and int. Atomics are more likely to always be integer. This just keeps things simple and doesn't process either. Differential Revision: https://reviews.llvm.org/D83770	2020-09-14 12:08:34 +01:00
Petar Avramovic	6e2a86ed5a	AMDGPU/GlobalISel Check for NoNaNsFPMath in isKnownNeverSNaN Check for NoNaNsFPMath function attribute in isKnownNeverSNaN. Function attributes are held in 'TargetMachine.Options'. Among other things, this allows selection of some patterns imported in D87351 since G_FCANONICALIZE is not generated when isKnownNeverSNaN returns true in lowerFMinNumMaxNum. However, we notice some incorrect results since function attributes are not correctly written to TargetMachine.Options when the next function is processed. Take a look at @v_test_no_global_nnans_med3_f32_pat0_srcmod0: it has "no-nans-fp-math"="false" but TargetMachine.Options still has it set to true since the first function in the test file had this attribute set to true. This will be fixed in D87511. Differential Revision: https://reviews.llvm.org/D87456	2020-09-14 12:11:00 +02:00
Simon Pilgrim	00e5676cf6	[LegalizeDAG] Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.	2020-09-14 11:09:43 +01:00
Jeremy Morse	d3af441dfe	[DebugInstrRef][1/9] Add fields for instr-ref variable locations Add a DBG_INSTR_REF instruction and a "debug instruction number" field to MachineInstr. The two allow variable values to be specified by identifying where the value is computed, rather than the register it lies in, like so: %0 = fooinst, debug-instr-number 1 [...] DBG_INSTR_REF 1, 0 See the original RFC for motivation: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html This patch is NFCI; it only adds fields and other boiler plate. Differential Revision: https://reviews.llvm.org/D85741	2020-09-14 10:06:52 +01:00
David Sherwood	15bff4dec4	[CodeGen] Fix bug in IncrementPointer In an earlier patch I meant to add the correct flags to the ADD node when incrementing the pointer, but forgot to pass them to SelectionDAG::getNode. Differential Revision: https://reviews.llvm.org/D87496	2020-09-14 08:03:55 +01:00
Yevgeny Rouban	88690a9658	[CodeGenPrepare] Fix zapping dead operands of assume This patch fixes a problem in commit `52cc97a0`. A test case is created to demonstrate the crash caused by the instruction iterator being invalidated by the recursive removal of dead operands of assume. The solution restarts from the block's first instruction in case CurInstIterator is invalidated by RecursivelyDeleteTriviallyDeadInstructions(). Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D87434	2020-09-14 11:46:34 +07:00
Craig Topper	56b33391d3	[SelectionDAG] Move ISD::PARITY formation from DAGCombine to SimplifyDemandedBits. Previously, we formed ISD::PARITY by looking for (and (ctpop X), 1) but the AND might be separated from the ctpop. For example if the parity result is multiplied by 2, we'll pull the AND through the shift. So to handle more cases, move to SimplifyDemandedBits, where we can catch any pattern that results in only the LSB of the CTPOP being used.	2020-09-13 21:04:13 -07:00
Qiu Chaofan	a4c5351986	[DAGCombiner] Propagate FMF flags in FMA folding DAG combiner folds (fma a 1.0 b) into (fadd a b) but the flag isn't propagated into new fadd. This patch fixes that. Some code in visitFMA is redundant and such support for vector constants is missing. Need follow-up patch to clean. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D87037	2020-09-14 00:19:06 +08:00
David Green	9237fde481	[CGP] Prevent optimizePhiType from iterating forever The recently added optimizePhiType algorithm had no checks to make sure it didn't continually iterate back and forth between float and int types. This means that given an input like store(phi(bitcast(load))), we could convert that back and forth to store(bitcast(phi(load))). This particular case would usually have been simplified to a different load type (folding the bitcast into the load) before CGP, but other cases can occur. The one that came up was phi(bitcast(phi)), where the two phis of different types were bitcast between. That was not helped by a dead bitcast being kept around which could make conversion look profitable. This adds an extra check of the bitcast Uses or Defs, to make sure that at least one is grounded and will not end up being converted back. It also makes sure that dead bitcasts are removed, and there is a minor change to include newly created Phi nodes in the Visited set so that they do not need to be revisited. Differential Revision: https://reviews.llvm.org/D82676	2020-09-13 16:11:01 +01:00
Craig Topper	61d29e0dff	[LegalizeTypes] Remove a few cases from SplitVectorOperand that should never happen. NFC CTTZ, CTLZ, CTPOP, and FCANONICALIZE all have the same input and output types so the operand should have already been legalized when the result type was legalized.	2020-09-12 20:59:14 -07:00
Craig Topper	ad3d6f993d	[SelectionDAG][X86][ARM][AArch64] Add ISD opcode for __builtin_parity. Expand it to shifts and xors. Clang emits (and (ctpop X), 1) for __builtin_parity. If ctpop isn't natively supported by the target, this leads to poor codegen due to the expansion of ctpop being more complex than what is needed for parity. This adds a DAG combine to convert the pattern to ISD::PARITY before operation legalization. Type legalization is updated to handle Expanding and Promoting this operation. If after type legalization, CTPOP is supported for this type, LegalizeDAG will turn it back into CTPOP+AND. Otherwise LegalizeDAG will emit a series of shifts and xors followed by an AND with 1. I've avoided vectors in this patch to limit legalization complexity. X86 previously had a custom DAG combiner for this. This is now moved to Custom lowering for the new opcode. There is a minor regression in vector-reduce-xor-bool.ll, but a follow up patch can easily fix that. Fixes PR47433 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87209	2020-09-12 11:42:18 -07:00
Sanjay Patel	3a8ea8609b	[Intrinsics] define semantics for experimental fmax/fmin vector reductions As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html This is hopefully the final remaining showstopper before we can remove the 'experimental' from the reduction intrinsics. No behavior was specified for the FP min/max reductions, so we have a mess of different interpretations. There are a few potential options for the semantics of these max/min ops. I think this is the simplest based on current behavior/implementation: make the reductions inherit from the existing llvm.maxnum/minnum intrinsics. These correspond to libm fmax/fmin, and those are similar to the (now deprecated?) IEEE-754 maxNum/minNum functions (NaNs are treated as missing data). So the default expansion creates calls to libm functions. Another option would be to inherit from llvm.maximum/minimum (NaNs propagate), but most targets just crash in codegen when given those nodes because no default expansion was ever implemented AFAICT. We could also just assume 'nnan' semantics by default (we are already assuming 'nsz' semantics in the maxnum/minnum intrinsics), but some targets (AArch64, PowerPC) support the more defined behavior, so it doesn't make much sense to not allow a tighter spec. Fast-math-flags (nnan) can be used to loosen the semantics. (Note that D67507 was proposed to update the LangRef to acknowledge the more recent IEEE-754 2019 standard, but that patch seems to have stalled. If we do update based on the new standard, the reduction instructions can seamlessly inherit from whatever updates are made to the max/min intrinsics.) x86 sees a regression here on 'nnan' tests because we have underlying, longstanding bugs in FMF creation/propagation. Those need to be fixed apart from this change (for example: https://llvm.org/PR35538). The expansion sequence before this patch may not have been correct. Differential Revision: https://reviews.llvm.org/D87391	2020-09-12 09:10:28 -04:00
Yuanfang Chen	ad99e34c59	Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit `31ecf8d29d`. This reverts commit `3fdaa8602a`. There is a layering violation for Target->CodeGen.	2020-09-11 18:52:32 -07:00
Yuanfang Chen	31ecf8d29d	[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline Following up on D67687. Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html `CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences. - Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`) - `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design. - `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D83608	2020-09-11 16:41:17 -07:00
Matt Arsenault	382b2b1b51	RegAllocFast: Fix typo in comment	2020-09-11 18:06:14 -04:00
Matt Arsenault	e21bb31eb6	CodeGen: Require SSA to run PeepholeOptimizer	2020-09-11 18:03:04 -04:00
Jeremy Morse	0caeaff123	[LiveDebugValues][NFC] Re-land `60db26a66d`, add instr-ref tests This was landed but reverted in `5b9c2b1bea` due to asan picking up a memory leak. This is fixed in the change to InstrRefBasedImpl.cpp. Original commit message follows: [LiveDebugValues][NFC] Add instr-ref tests, adapt old tests This patch adds a few tests in DebugInfo/MIR/InstrRef/ of interesting behaviour that the instruction referencing implementation of LiveDebugValues has. Mostly, these tests exist to ensure that if you give the "-experimental-debug-variable-locations" command line switch, the right implementation runs; and to ensure it behaves the same way as the VarLoc LiveDebugValues implementation. I've also touched roughly 30 other tests, purely to make the tests less rigid about what output to accept. DBG_VALUE instructions are usually printed with a trailing !debug-location indicating its scope: !debug-location !1234 However InstrRefBasedLDV produces new DebugLoc instances on the fly, meaning there sometimes isn't a numbered node when they're printed, making the output: !debug-location !DILocation(line: 0, blah blah) Which causes a ton of these tests to fail. This patch removes checks for that final part of each DBG_VALUE instruction. None of them appear to be actually checking the scope is correct, just that it's present, so I don't believe there's any loss in coverage here. Differential Revision: https://reviews.llvm.org/D83054	2020-09-11 12:14:44 +01:00
Benjamin Kramer	5405ee553a	[CodeGenPrepare] Simplify code. NFCI.	2020-09-11 11:24:08 +02:00
Martin Storsjö	46416f0803	[CodeGen] [WinException] Remove a redundant explicit section switch for aarch64 The following EmitWinEHHandlerData() implicitly switches to .xdata, just like on x86_64. This became orphaned from the original code requiring it in `0b61d220c9` / https://reviews.llvm.org/D61095. Differential Revision: https://reviews.llvm.org/D87447	2020-09-11 10:31:04 +03:00
Alok Kumar Sharma	e45b0708ae	[DebugInfo] Fixing CodeView assert related to lowerBound field of DISubrange. This is to fix CodeView build failure https://bugs.llvm.org/show_bug.cgi?id=47287 after DISubrange upgrade D80197. Assert condition is now removed and Count is calculated in case LowerBound is absent or zero and Count or UpperBound is constant. If Count is unknown it is later handled as VLA (currently Count is set to zero). Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87406	2020-09-11 11:12:49 +05:30
Reid Kleckner	2c73bef7fa	Fix wrong comment about enabling optimizations to work around a bug	2020-09-10 16:45:20 -07:00
Reid Kleckner	4e3edef4b8	Use pragmas to work around MSVC x86_32 debug miscompile bug Halide users reported this here: https://llvm.org/pr46176 I reported the issue to MSVC here: https://developercommunity.visualstudio.com/content/problem/1179643/msvc-copies-overaligned-non-trivially-copyable-par.html This codepath is apparently not covered by LLVM's unit tests, so I added coverage in a unit test. If we want to support this configuration going forward, it means that it is in general not safe to pass a SmallVector<T, N> by value if alignof(T) is greater than 4. This doesn't appear to come up often because passing a SmallVector by value is inefficient and not idiomatic: it copies the inline storage. In this case, the SmallVector<LLT,4> is captured by value by a lambda, and the lambda is passed by value into std::function, and that's how we hit the bug. Differential Revision: https://reviews.llvm.org/D87475	2020-09-10 14:50:01 -07:00
Volkan Keles	d4bf90271f	GlobalISel: Combine fneg(fneg x) to x https://reviews.llvm.org/D87473	2020-09-10 12:57:38 -07:00
Anna Thomas	b1b9806370	[ImplicitNullChecks] NFC: Remove unused PointerReg arg in dep analysis The PointerReg arg was passed into the dependence function for an assertion which no longer exists. So, this patch updates the dependence functions to avoid the PointerReg in the signature. Tests-Run: make check	2020-09-10 15:31:57 -04:00
Anna Thomas	46329f6079	[ImplicitNullCheck] Handle instructions that preserve zero value This is the first in a series of patches to make implicit null checks more general. This patch identifies instructions that preserve the zero value of a register and considers them valid instructions to hoist along with the faulting load. See added testcases. Reviewed-By: reames, dantrushin Differential Revision: https://reviews.llvm.org/D87108	2020-09-10 13:39:50 -04:00
Simon Pilgrim	f42f733af9	SwitchLoweringUtils.h - reduce TargetLowering.h include. NFCI. Only include the headers we actually need, and move the remaining includes down to implicit dependent files.	2020-09-10 17:42:18 +01:00
Jay Foad	517202c720	[TargetLowering] Fix comments describing XOR -> OR/AND transformations	2020-09-10 13:56:34 +01:00
Kerry McLaughlin	cd89f5c91b	[SVE][CodeGen] Legalisation of truncate for scalable vectors Truncating from an illegal SVE type to a legal type, e.g. `trunc <vscale x 4 x i64> %in to <vscale x 4 x i32>` fails after PromoteIntOp_CONCAT_VECTORS attempts to create a BUILD_VECTOR. This patch changes the promote function to create a sequence of INSERT_SUBVECTORs if the return type is scalable, and replaces these with UNPK+UZP1 for AArch64. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D86548	2020-09-10 11:35:33 +01:00
Nikita Popov	0a5dc7effb	[DAGCombiner] Fold fmin/fmax of NaN fminnum(X, NaN) is X, fminimum(X, NaN) is NaN. This mirrors the behavior of existing InstSimplify folds. This is expected to improve the reduction lowerings in D87391, which use NaN as a neutral element. Differential Revision: https://reviews.llvm.org/D87415	2020-09-09 23:53:32 +02:00
Amara Emerson	e5784ef8f6	[GlobalISel] Enable usage of BranchProbabilityInfo in IRTranslator. We weren't using this before, so none of the MachineFunction CFG edges had the branch probability information added. As a result, block placement later in the pipeline was flying blind. This is enabled only with optimizations enabled like SelectionDAG. Differential Revision: https://reviews.llvm.org/D86824	2020-09-09 14:31:12 -07:00
Amara Emerson	467a071285	[GlobalISel][IRTranslator] Generate better conditional branch lowering. This is a port of the functionality from SelectionDAG, which tries to find a tree of conditions from compares that are then combined using OR or AND, before using that result as the input to a branch. Instead of naively lowering the code as is, this change converts that into a sequence of conditional branches on the sub-expressions of the tree. Like SelectionDAG, we re-use the case block codegen functionality from the switch lowering utils, which causes us to generate some different code. The result of which I've tried to mitigate in earlier combine patches. Differential Revision: https://reviews.llvm.org/D86665	2020-09-09 13:16:11 -07:00
Amara Emerson	cc76da7ada	[GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less. This combine previously tried to take sequences like: %cond = G_ICMP pred, a, b G_BRCOND %cond, %truebb G_BR %falsebb %truebb: ... %falsebb: ... and by inverting the compare predicate and swapping branch targets, delete the G_BR and instead have a single conditional branch to the falsebb. Since in an earlier patch we have a combine to fold not(icmp) into just an inverted icmp, we don't need this combine to do as much. This patch instead generalizes the combine by just looking for: G_BRCOND %cond, %truebb G_BR %falsebb %truebb: ... %falsebb: ... and then inverting the condition using a not (xor). The xor can be folded away in a separate combine. This change also lets us avoid some optimization code in the IRTranslator. I also think that deleting G_BRs in the combiner is unnecessary. That's something that targets can decide to do at selection time and could simplify generic code in future. Differential Revision: https://reviews.llvm.org/D86664	2020-09-09 13:08:16 -07:00
Ulrich Weigand	1a25133bcd	[DAGCombine] Skip re-visiting EntryToken to avoid compile time explosion During the main DAGCombine loop, whenever a node gets replaced, the new node and all its users are pushed onto the worklist. Omit this if the new node is the EntryToken (e.g. if a store managed to get optimized out), because re-visiting the EntryToken and its users will not uncover any additional opportunities, but there may be a large number of such users, potentially causing compile time explosion. This compile time explosion showed up in particular when building the SingleSource/UnitTests/matrix-types-spec.cpp test-suite case on any platform without SIMD vector support. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86963	2020-09-09 19:13:46 +02:00
Alon Kom	818cf30b83	[MachinePipeliner] Fix II_setByPragma initialization II_setByPragma was not reset between 2 calls of the MachinePipeliner pass Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D87088	2020-09-09 13:38:35 +00:00
Denis Antrushin	4358fa782e	[Statepoints] Update DAG root after emitting statepoint. Since we always generate CopyToRegs for statepoint results, we must update DAG root after emitting statepoint, so that these copies are scheduled before any possible local uses. Note: getControlRoot() flushes all PendingExports, not only those we generate for relocates. If that becomes a problem, we can change it to flush relocate exports only. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87251	2020-09-09 20:22:10 +07:00
Simon Pilgrim	d816499f95	[KnownBits] Move SelectionDAG::computeKnownBits ISD::ABS handling to KnownBits::abs Move the ISD::ABS handling to a KnownBits::abs handler, to simplify future implementations in ValueTracking/GlobalISel.	2020-09-09 13:22:58 +01:00
Denis Antrushin	2a52c3301a	[Statepoints] Properly handle const base pointer. Current code in InstrEmitter assumes all GC pointers are either VRegs or stack slots - hence, taking only one operand. But it is possible to have constant base, in which case it occupies two machine operands. Add a convenience function to StackMaps to get index of next meta argument and use it in InstrEmitter to properly advance to the next statepoint meta operand. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87252	2020-09-09 14:07:00 +07:00
Craig Topper	844e94a502	[SelectionDAGBuilder] Remove Unnecessary FastMathFlags temporary. Use SDNodeFlags instead. NFCI This was a missed simplification in D87200	2020-09-08 15:50:12 -07:00
Craig Topper	b1e68f885b	[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact. This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130. This required adding a SDNodeFlags to SelectionDAG::getSetCC. Now we manage to constant fold some stuff with undefs during the initial getNode that we don't do in later DAG combines. Differential Revision: https://reviews.llvm.org/D87200	2020-09-08 15:27:21 -07:00
Volkan Keles	1242dd330d	GlobalISel: Combine `op undef, x` to 0 https://reviews.llvm.org/D86611	2020-09-08 09:46:38 -07:00
Simon Pilgrim	3c83b967cf	LiveRegUnits.h - reduce MachineRegisterInfo.h include. NFC. We only need to include MachineInstrBundle.h, but this exposes an implicit dependency in MachineOutliner.h. Also, remove duplicate includes from LiveRegUnits.cpp + MachineOutliner.cpp.	2020-09-08 17:27:00 +01:00
Jonas Paulsson	6dc3e22b57	[DAGTypeLegalizer] Handle ZERO_EXTEND of promoted type in WidenVecRes_Convert. On SystemZ, a ZERO_EXTEND of an i1 vector handled by WidenVecRes_Convert() always ended up being scalarized, because the type action of the input is promotion which was previously an unhandled case in this method. This fixes https://bugs.llvm.org/show_bug.cgi?id=47132. Differential Revision: https://reviews.llvm.org/D86268 Patch by Eli Friedman. Review: Ulrich Weigand	2020-09-08 16:49:51 +02:00
Craig Topper	da79b1eecc	[SelectionDAG][X86][ARM] Teach ExpandIntRes_ABS to use sra+add+xor expansion when ADDCARRY is supported. Rather than using SELECT instructions, use SRA, UADDO/ADDCARRY and XORs to expand ABS. This is the multi-part version of the sequence we use in LegalizeDAG. It's also the same as the Custom sequence uses for i64 on 32-bit and i128 on 64-bit. So we can remove the X86 customization. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D87215	2020-09-07 13:15:26 -07:00
Sanjay Patel	7a06b166b1	[DAGCombiner] allow more store merging for non-i8 truncated ops This is a follow-up suggested in D86420 - if we have a pair of stores in inverted order for the target endian, we can rotate the source bits into place. The "be_i64_to_i16_order" test shows a limitation of the current function (which might be avoided if we integrate this function with the other cases in mergeConsecutiveStores). In the earlier "be_i64_to_i16" test, we skip the first 2 stores because we do not match the full set as consecutive or rotate-able, but then we reach the last 2 stores and see that they are an inverted pair of 16-bit stores. The "be_i64_to_i16_order" test alters the program order of the stores, so we miss matching the sub-pattern. Differential Revision: https://reviews.llvm.org/D87112	2020-09-07 14:12:36 -04:00
Momchil Velikov	eb482afaf5	Reduce the number of memory allocations when displaying a warning about clobbering reserved registers (NFC). Also address some minor inefficiencies and style issues. Differential Revision: https://reviews.llvm.org/D86088	2020-09-07 17:04:00 +01:00
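The sra+add+xor expansion described in `da79b1eecc` can be illustrated outside of LLVM with a minimal C++ sketch. This is not the code from the patch: the `Halves` struct and the `expandAbs`, `split`, and `join` helpers are hypothetical names chosen for illustration, and the sketch assumes a 64-bit value split into two 32-bit parts. It only mirrors the sequence the commit message names: broadcast the sign bit (SRA), add it across both halves with a carry (UADDO/ADDCARRY), then XOR each half with it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type: a 64-bit integer held as two 32-bit halves.
struct Halves {
    uint32_t lo;
    uint32_t hi;
};

// Split a signed 64-bit value into halves (illustration helper).
Halves split(int64_t v) {
    uint64_t u = static_cast<uint64_t>(v);
    return {static_cast<uint32_t>(u), static_cast<uint32_t>(u >> 32)};
}

// Join two halves back into an unsigned 64-bit value (illustration helper).
uint64_t join(Halves h) {
    return (static_cast<uint64_t>(h.hi) << 32) | h.lo;
}

// abs(x) computed as (x + sign) ^ sign, where sign is 0 or all ones.
Halves expandAbs(Halves x) {
    // SRA step: replicate the sign bit of the high half into a full mask.
    uint32_t sign = (x.hi & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    // UADDO step: add the mask to the low half and record the carry out.
    uint32_t lo = x.lo + sign;
    uint32_t carry = (lo < x.lo) ? 1u : 0u;
    // ADDCARRY step: add the mask to the high half plus the carry in.
    uint32_t hi = x.hi + sign + carry;
    // XOR step: flip every bit when the input was negative.
    return {lo ^ sign, hi ^ sign};
}

// Small self-check exercising positive, negative, and carry-free inputs.
void checkExpandAbs() {
    assert(join(expandAbs(split(5))) == 5u);
    assert(join(expandAbs(split(-5))) == 5u);
    assert(join(expandAbs(split(-(int64_t{1} << 32)))) == (uint64_t{1} << 32));
}
```

No select or branch is needed in this form, which is the property the commit message contrasts with the earlier SELECT-based expansion.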

29468 Commits