llvm-project

Commit Graph

Author	SHA1	Message	Date
Reid Kleckner	47cc6db928	Re-land [Debug][CodeView] Emit fully qualified names for globals This reverts commit `525a591f0f`. Fixed an issue with pointers to members based on typedefs. In this case, LLVM would emit a second UDT. I fixed it by not passing the class type to getTypeIndex when the base type is not a function type. lowerType only uses the class type for direct function types. This suggests if we have a PMF with a function typedef, there may be an issue, but that can be solved separately.	2020-05-18 17:31:00 -07:00
Matt Arsenault	ae98939172	GlobalISel: Fold G_MUL x, 0, and G_*DIV 0, x	2020-05-18 18:08:26 -04:00
Amara Emerson	17842025ed	[GlobalISel] Add support for using vector values in memset inlining.	2020-05-18 14:56:16 -07:00
Matt Arsenault	3e315697ac	DAG: Use correct pointer size for llvm.ptrmask This was ignoring the address space, and would assert on address spaces with a different size from the default.	2020-05-18 16:46:11 -04:00
Craig Topper	c9f63297e2	Fix several places that were calling verifyFunction or verifyModule without checking the return value. verifyFunction/verifyModule don't assert or error internally. They also don't print anything if you don't pass a raw_ostream to them. So the caller needs to check the result and ideally pass a stream to get the messages. Otherwise they're just really expensive no-ops. I've filed PR45965 for another instance in SLPVectorizer that causes a lit test failure. Differential Revision: https://reviews.llvm.org/D80106	2020-05-18 13:28:46 -07:00
David Sherwood	364c595403	[SVE] Ignore scalable vectors in InterleavedLoadCombinePass I have changed the pass so that we ignore shuffle vectors with scalable vector types, and replaced VectorType with FixedVectorType in the rest of the pass. I couldn't think of an easy way to test this change, since for scalable vectors we shouldn't be using shufflevectors for interleaving. This change fixes up some type size assert warnings I found in the following test: CodeGen/AArch64/sve-intrinsics-int-arith-imm.ll Differential Revision: https://reviews.llvm.org/D79700	2020-05-18 16:35:55 +01:00
Hans Wennborg	525a591f0f	Revert `76c5f277f2` "Re-land [Debug][CodeView] Emit fully qualified names for globals" > Before this patch, S_[L|G][THREAD32|DATA32] records were emitted with a simple name, not the fully qualified name (namespace + class scope). > > Differential Revision: https://reviews.llvm.org/D79447 This causes asserts in Chromium builds: CodeViewDebug.cpp:2997: void llvm::CodeViewDebug::emitDebugInfoForUDTs(const std::vector<std::pair<std::string, const DIType *>> &): Assertion `OriginalSize == UDTs.size()' failed. I will follow up on the Phabricator issue.	2020-05-18 11:26:30 +02:00
OCHyams	709c52b955	[DebugInfo][DWARF] Emit a single location instead of a location list for variables in nested scopes (including inlined functions) if there is a single location which covers the entire scope and the scope is contained in a single block. Based on work by @jmorse. Reviewed By: vsk, aprantl Differential Revision: https://reviews.llvm.org/D79571	2020-05-18 09:43:32 +01:00
Mehdi Amini	8697d443ab	Fix warning "defined but not used" for debug function (NFC)	2020-05-17 23:50:18 +00:00
Mehdi Amini	ffc6e593d2	Replace dyn_cast with isa when the result isn't used (NFC) Fix build warning: unused variable 'BB'	2020-05-17 23:15:17 +00:00
Nikita Popov	52e98f620c	[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC) Now that load/store alignment is required, we no longer need most of them. Also switch the getLoadStoreAlignment() helper to return Align instead of MaybeAlign.	2020-05-17 22:19:15 +02:00
David Blaikie	a055e3856f	DebugInfo: Reduce long-distance dependence on what will/won't emit a debug_addr section This is a no-op/NFC at the moment & generally makes the code /somewhat/ cleaner/less reliant on assumptions about what will produce a debug_addr section. It's still a bit "spooky action at a distance" - the add ranges code pre-emptively inserts addresses into the address pool it knows will eventually be used by the range emission code (or low/high pc). The 'ideal' would be either to actually compute the addresses needed for range (& loc) emission earlier - which would mean decanonicalizing the range/loc representation earlier to account for whether it was going to use addrx encodings or not (which would be unfortunate, but could be refactored to be relatively unobtrusive). Alternatively, emitting the range/loc sections earlier would cause them to request the needed addresses sooner - but then you end up having to split finalizeModuleInfo because some things need to be handled there before the ranges/locs are emitted, I think...	2020-05-17 12:45:56 -07:00
Craig Topper	796ae8cf82	[LegalizeDAG] Use MachinePointerInfo::getUnknownStack in place of MachinePointerInfo() in a couple places. NFC We know the pointer somewhere on the stack, we just don't know exactly where since the index may be variable. Differential Revision: https://reviews.llvm.org/D80060	2020-05-16 15:48:16 -07:00
Eli Friedman	4f04db4b54	AllocaInst should store Align instead of MaybeAlign. Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine. Differential Revision: https://reviews.llvm.org/D80044	2020-05-16 14:53:16 -07:00
Sanjay Patel	5be37cb124	[x86][CGP] try to hoist funnel shift above select-of-splats This is basically the same patch as D63233, but converted to funnel shifts rather than regular shifts. I did not see a way to effectively share code for these 2 cases though. This follows D79718 and D79827 to re-fix PR37426 because that gets canonicalized to funnel shift intrinsics in IR. I did draft an alternative patch as an enhancement to "shouldSinkOperands()", but that was awkward because we have to key the transform from the select, but then look at both its users and its operands.	2020-05-16 10:44:47 -04:00
Simon Pilgrim	228913780b	DIEHash.cpp - remove headers explicitly included in DIEHash.h. NFC. Don't duplicate module header includes.	2020-05-16 15:00:57 +01:00
Simon Pilgrim	25656332f1	AggressiveAntiDepBreaker.cpp - remove headers explicitly included in AggressiveAntiDepBreaker.h. NFC. Don't duplicate module header includes.	2020-05-16 15:00:56 +01:00
Craig Topper	13d44b2a0c	[LegalizeDAG] Use getMemBasePlusOffset to simplify some code. Use other signature of getMemBasePlusOffset in another location. NFCI The code was calculating an offset from a stack pointer SDValue. This is exactly what getMemBasePlusOffset does. I also replaced sizeof(int) with a hardcoded 4. We know the type we're operating on is 4 bytes. But the size of int that the source code is being compiled with isn't guaranteed to be 4 bytes. While here replace another use of getMemBasePlusOffset that was preceded by a call to getConstant with the other signature that calls getConstant internally.	2020-05-16 01:02:08 -07:00
Craig Topper	45c7b3fd91	[LegalizeVectorTypes] Remove non-constant INSERT_SUBVECTOR handling. NFC Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-15 23:56:13 -07:00
Ten Tzen	e32f8e5d4a	[Windows EH] Fix the order of Nested try-catches in $tryMap$ table This bug is exposed by Test7 of ehthrow.cxx in MSVC EH suite where a rethrow occurs in a try-catch inside a catch (i.e., nested Catch handlers). See the test code in https://github.com/microsoft/compiler-tests/blob/master/eh/ehthrow.cxx#L346 When an object is rethrown in a Catch handler, the copy-ctor of this object must be executed after the destructions of live objects, but BEFORE the dtors of live objects in parent handlers. Today the Windows 64-bit runtime (__CxxFrameHandler3 & 4) expects nested Catch handlers to be stored in pre-order (outer first, inner next) in the $tryMap$ table, so that given a State, its Catch's beginning State can be properly retrieved. The Catch beginning state (which is also the ending State) is the State where the rethrown object's copy-ctor must take place. LLVM currently stores nested catch handlers in post-order because it's the natural way to compute the highest State in Catch. The fix is to simply store TryCatch handlers in pre-order, but update Catch's highest State after child Catches are all processed. Differential Revision: https://reviews.llvm.org/D79474?id=263919	2020-05-15 22:03:43 -07:00
Diogo Sampaio	6c68f75ee4	Prevent register coalescing in functions with setjmp Summary: In the given example, a stack slot pointer is merged between a setjmp and longjmp. This pointer is spilled, so it does not get correctly restored, adding undefined behaviour where it shouldn't. Change-Id: I60ec010844f2a24ce01ceccf12eb5eba5ab94abb Reviewers: eli.friedman, thanm, efriedma Reviewed By: efriedma Subscribers: MatzeB, qcolombet, tpr, rnk, efriedma, hiraditya, llvm-commits, chill Tags: #llvm Differential Revision: https://reviews.llvm.org/D77767	2020-05-16 00:36:34 +01:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Alexandre Ganea	76c5f277f2	Re-land [Debug][CodeView] Emit fully qualified names for globals Before this patch, S_[L|G][THREAD32|DATA32] records were emitted with a simple name, not the fully qualified name (namespace + class scope). Differential Revision: https://reviews.llvm.org/D79447	2020-05-15 10:37:09 -04:00
David Sherwood	fb1c55b57d	[CodeGen] Fix FoldConstantVectorArithmetic for scalable vectors For now I have changed FoldConstantVectorArithmetic to return early if we encounter a scalable vector, since the subsequent code assumes you can perform lane-wise constant folds. However, in future work we should be able to extend this to look at splats of a constant value and fold those if possible. I have also added the same code to FoldConstantArithmetic, since that deals with vectors too. The warnings I fixed in this patch were being generated by this existing test: CodeGen/AArch64/sve-int-arith.ll Differential Revision: https://reviews.llvm.org/D79421	2020-05-15 14:58:44 +01:00
Ties Stuij	8c24f33158	[IR][BFloat] Add BFloat IR type Summary: The BFloat IR type is introduced to provide support for, initially, the BFloat16 datatype introduced with the Armv8.6 architecture (optional from Armv8.2 onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE 754 floating point IR type. This is part of a patch series upstreaming Armv8.6 features. Subsequent patches will upstream intrinsics support and C-lang support for BFloat. Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith Tags: #llvm Differential Revision: https://reviews.llvm.org/D78190	2020-05-15 14:43:43 +01:00
Simon Pilgrim	9d4b4f344d	DAGCombiner.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC. Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-15 12:41:35 +01:00
Konstantin Schwarz	5425cdc3ad	[GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified Summary: D78319 introduced basic support for inline asm input operands in GlobalISel. However, that patch did not handle the case where a memory input operand still needs to be indirectified. Later code asserts that the memory operand is already indirect. This patch adds an early return false to trigger the SelectionDAG fallback for now. Reviewers: arsenm, paquette Reviewed By: arsenm Subscribers: thakis, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79955	2020-05-15 13:37:06 +02:00
David Sherwood	8ce4a8f6df	[CodeGen] Refactor CreateStackTemporary I've created a new variant of CreateStackTemporary that takes TypeSize and Align arguments, and made the older instances of CreateStackTemporary call this new function. This refactoring is in preparation for more patches in this area related to scalable vectors and improving the alignment calculations. Differential Revision: https://reviews.llvm.org/D79933	2020-05-15 07:29:13 +01:00
Alok Kumar Sharma	4042ada1c1	[DebugInfo] support for DW_AT_data_location in llvm This patch adds support for DWARF attribute DW_AT_data_location. Summary: Dynamic arrays in Fortran are described by an array descriptor and a data allocation address. The former is mapped to DW_AT_location and the latter is mapped to DW_AT_data_location. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D79592	2020-05-15 11:33:17 +05:30
Alok Kumar Sharma	ab699d78a2	[DebugInfo] llvm rejects DWARF operator DW_OP_push_object_address llvm rejects DWARF operator DW_OP_push_object_address. This DWARF operator is needed for Flang to support allocatable arrays. Summary: Currently llvm rejects DWARF operator DW_OP_push_object_address. The error below is produced when llvm finds this operator. [..] invalid expression !DIExpression(151) warning: ignoring invalid debug info in pushobj.ll [..] Some parts of the support for this operator are missing and need to be completed. Testing -added a unit testcase -check-debuginfo -check-llvm Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D79306	2020-05-15 11:10:35 +05:30
Kang Zhang	aedb6615a8	[MachineVerifier] Use a range-based for loop instead of llvm::any_of Summary: Patch D78849 replaced a for loop with llvm::any_of to simplify the function addRequired(). That change is not an NFC conversion: any_of returns as soon as any addRequired(Reg) call returns true, but we want addRequired(Reg) to be called for every element. This patch uses a range-based for loop to fix the any_of bug. Reviewed By: MaskRay, nickdesaulniers Differential Revision: https://reviews.llvm.org/D79872	2020-05-15 02:35:33 +00:00
Nico Weber	e0c1554274	Revert "[GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified" This reverts commit `887dfeec53`. It broke irtranslator-inline-asm.ll on many bots, e.g. http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/38606/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Airtranslator-inline-asm.ll	2020-05-14 19:37:05 -04:00
Konstantin Schwarz	887dfeec53	[GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified Summary: D78319 introduced basic support for inline asm input operands in GlobalISel. However, that patch did not handle the case where a memory input operand still needs to be indirectified. Later code asserts that the memory operand is already indirect. This patch adds an early return false to trigger the SelectionDAG fallback for now. Reviewers: arsenm, paquette Reviewed By: arsenm Subscribers: wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79955	2020-05-14 23:42:31 +02:00
Stanislav Mekhanoshin	184b383457	Add v16f64 value type We need to use it to handle <16 x double> indirect indexes in the AMDGPU BE. The only visible change from adding it is in ARM cost model. To me it looks reasonable. With doubling a vector size it quadruples the cost up to the size 8 and then it did only double it. Now it also quadruples, which seems a logical progression to me. Actual AMDGPU code is to follow, this is a common part, plus load/store legalization in the AMDGPU BE not to break what works now. Differential Revision: https://reviews.llvm.org/D79952	2020-05-14 14:28:00 -07:00
Eli Friedman	accc6b5545	LoadInst should store Align, not MaybeAlign. The fact that loads and stores can have the alignment missing is a constant source of confusion: code that usually works can break down in rare cases. So fix the LoadInst API so the alignment is never missing. To reduce the number of changes required to make this work, IRBuilder and certain LoadInst constructors will grab the module's datalayout and compute the alignment automatically. This is the same alignment instcombine would eventually apply anyway; we're just doing it earlier. There's a minor risk that the way we're retrieving the datalayout could break out-of-tree code, but I don't think that's likely. This is the last in a series of patches, so most of the necessary changes have already been merged. Differential Revision: https://reviews.llvm.org/D77454	2020-05-14 13:19:21 -07:00
Eli Friedman	4532a50899	Infer alignment of unmarked loads in IR/bitcode parsing. For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the IR later in the file. For optimization testcases that don't care about the datalayout, this is also really simple: we just use the default datalayout. The complexity here comes from the fact that some LLVM tools allow overriding the datalayout: some tools have an explicit flag for this, some tools will infer a datalayout based on the code generation target. Supporting this properly required plumbing through a bunch of new machinery: we want to allow overriding the datalayout after the datalayout is parsed from the file, but before we use any information from it. Therefore, IR/bitcode parsing now has a callback to allow tools to compute the datalayout at the appropriate time. Not sure if I covered all the LLVM tools that want to use the callback. (clang? lli? Misc IR manipulation tools like llvm-link?). But this is at least enough for all the LLVM regression tests, and IR without a datalayout is not something frontends should generate. This change had some sort of weird effects for certain CodeGen regression tests: if the datalayout is overridden with a datalayout with a different program or stack address space, we now parse IR based on the overridden datalayout, instead of the one written in the file (or the default one, if none is specified). This broke a few AVR tests, and one AMDGPU test. Outside the CodeGen tests I mentioned, the test changes are all just fixing CHECK lines and moving around datalayout lines in weird places. Differential Revision: https://reviews.llvm.org/D78403	2020-05-14 13:03:50 -07:00
Simon Pilgrim	acb6f1ae09	TargetLowering.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC. Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-14 18:13:58 +01:00
Jay Foad	17941437a2	[TargetLowering] Improve expansion of FSHL/FSHR Use an extra shift-by-1 instead of a compare and select to handle the shift-by-zero case. This sometimes saves one instruction (if the compare couldn't be combined with a previous instruction). It also works better on targets that don't have good select instructions. Note that currently this change doesn't affect most targets because expandFunnelShift is not used because funnel shift intrinsics are lowered early in SelectionDAGBuilder. But there is work afoot to change that; see D77152. Differential Revision: https://reviews.llvm.org/D77301	2020-05-14 16:36:22 +01:00
Sanjay Patel	26e742fd84	[x86][CGP] improve sinking of splatted vector shift amount operand Expands on the enablement of the shouldSinkOperands() TLI hook in: D79718 The last codegen/IR test diff shows what I suspected could happen - we were sinking all splat shift operands into a loop. But that's not what we want in general; we only want to sink the shift amount operand if it is a splat. Differential Revision: https://reviews.llvm.org/D79827	2020-05-14 08:36:03 -04:00
Simon Pilgrim	80715b7124	SelectionDAG.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC. Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-14 13:23:00 +01:00
Konstantin Schwarz	91063cf85a	[GlobalISel][InlineAsm] Add support for basic input operand constraints Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette Reviewed By: arsenm Subscribers: gargaroff, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78319	2020-05-14 10:43:37 +02:00
Eric Christopher	bfa200ebcf	Remove an unused variable.	2020-05-13 15:13:02 -07:00
Eli Friedman	ed428c429e	[SelectionDAG] Require constant index for INSERT/EXTRACT_SUBVECTOR. It sounds like an interesting idea in theory, but nothing is actually taking advantage of it, and specifying/implementing the edge cases is painful. So just forbid it. Differential Revision: https://reviews.llvm.org/D79814	2020-05-13 13:08:59 -07:00
Craig Topper	de92dc2850	[Statepoint] Mark FixupStatepointCallerSaved as preserving the CFG I'm hoping this will restore some compile time lost by D75936 and D75937. Differential Revision: https://reviews.llvm.org/D79813	2020-05-13 10:59:44 -07:00
Benjamin Kramer	a8bf2deae4	[CodeGenPrepare] Remove a superfluous variable. NFC. Fixes a -Wunused-variable warning in Release builds.	2020-05-13 18:25:20 +02:00
David Green	fa15255d8a	[ARM] Convert floating point splats to integer Under MVE a vdup will always take a gpr register, not a floating point value. During DAG combine we convert the types to a bitcast to an integer in an attempt to fold the bitcast into other instructions. This is OK, but only works inside the same basic block. To do the same trick across a basic block boundary we need to convert the type in codegenprepare, before the splat is sunk into the loop. This adds a convertSplatType function to codegenprepare to do that, putting bitcasts around the splat to force the type to an integer. There is then some adjustment to the code in shouldSinkOperands to handle the extra bitcasts. Differential Revision: https://reviews.llvm.org/D78728	2020-05-13 15:24:16 +01:00
Sourabh Singh Tomar	e59744fd9b	[DebugInfo] Fortran module DebugInfo support in LLVM This patch extends DIModule Debug metadata in LLVM to support Fortran modules. DIModule is extended to contain File and Line fields, these fields will be used by Flang FE to create debug information necessary for representing Fortran modules at IR level. Furthermore DW_TAG_module is also extended to contain these fields. If these fields are missing, debuggers like GDB won't be able to show Fortran modules information correctly. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D79484	2020-05-13 12:52:30 +05:30
Fangrui Song	66055230bf	[TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for -fno-unique-section-names GNU ld's internal linker script uses (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=add44f8d5c5c05e08b11e033127a744d61c26aee) .text : { *(.text.unlikely .text.*_unlikely .text.unlikely.*) *(.text.exit .text.exit.*) *(.text.startup .text.startup.*) *(.text.hot .text.hot.*) *(SORT(.text.sorted.*)) *(.text .stub .text.* .gnu.linkonce.t.*) /* .gnu.warning sections are handled specially by elf.em. */ *(.gnu.warning) } Because `*(.text.exit .text.exit.*)` is ordered before `*(.text .text.*)`, in a -ffunction-sections build, the C library function `exit` will be placed before other functions. gold's `-z keep-text-section-prefix` has the same problem. In lld, `-z keep-text-section-prefix` recognizes `.text.{exit,hot,startup,unlikely,unknown}.*`, but not `.text.{exit,hot,startup,unlikely,unknown}`, to avoid the strange placement problem. In -fno-function-sections or -fno-unique-section-names mode, a function whose `function_section_prefix` is set to `.exit` will go to the output section `.text` instead of `.text.exit` when linked by lld. To address the problem, append a dot to become `.text.exit.` Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D79600	2020-05-12 14:14:17 -07:00
David Blaikie	aa99da5ace	Avoid binding pointers to "auto&" (by dereferencing the pointer that's non-null anyway) Based on @djtodoro's `2552dc5317`	2020-05-12 11:40:00 -07:00
Craig Topper	8c72b0271b	[CodeGen] Use Align in MachineConstantPool.	2020-05-12 10:06:40 -07:00
Jay Foad	989be65b11	[GlobalISel][IRTranslator] Fix <1 x Ty> handling in ConstantExprs Summary: ConstantExprs involving operations on <1 x Ty> could translate into MIR that failed to verify with: *** Bad machine code: Reading virtual register without a def *** The problem was that translate(const Constant &C, Register Reg) had recursive calls that passed the same Reg in for the translation of a subexpression, but without updating VMap for the subexpression first as translate(const Constant &C, Register Reg) expects. Fix this by using the same translateCopy helper function that we use for translating Instructions. In some cases this causes extra G_COPY MIR instructions to be generated. Fixes https://bugs.llvm.org/show_bug.cgi?id=45576 Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78378	2020-05-12 16:51:03 +01:00
Jay Foad	bd80a8bb87	[GlobalISel][IRTranslator] New helper function translateCopy. NFC. Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78377	2020-05-12 16:51:03 +01:00
James Y Knight	e9536795a3	Add comment for SelectionDAGBuilder::SL field.	2020-05-12 10:46:08 -04:00
Djordje Todorovic	8b7b84e99d	Revert "[NFC][DwarfDebug] Prefer explicit to auto type deduction" This wasn't proposed by the LLVM Style Guide. Please see https://reviews.llvm.org/D79624. This reverts commit rG2552dc5317e0.	2020-05-12 09:44:31 +02:00
Djordje Todorovic	41ca605813	Revert "[NFC][DwarfDebug] Avoid default capturing when using lambdas" Reverting this because we found it isn't that useful. Please see https://reviews.llvm.org/D79616. This reverts commit rG45e5a32a8bd3.	2020-05-12 09:37:28 +02:00
David Sherwood	42c7a6d52b	[CodeGen] Fix incorrect uses of getVectorNumElements() I have fixed up some places in SelectionDAG::getNode() where we used to assert that the number of vector elements for two types are the same. I have changed such cases to assert that the element counts are the same instead. I've added new tests that exercise the code paths for all the truncations. All the extend operations are covered by this existing test: CodeGen/AArch64/sve-sext-zext.ll For the ISD::SETCC case I fixed this code path is exercised by these existing tests: CodeGen/AArch64/sve-fcmp.ll CodeGen/AArch64/sve-intrinsics-int-compares-with-imm.ll Differential Revision: https://reviews.llvm.org/D79399	2020-05-12 07:50:37 +01:00
Eli Friedman	c9c930ae67	[SelectionDAG] Don't promote the alignment of allocas beyond the stack alignment. allocas in LLVM IR have a specified alignment. When that alignment is specified, the alloca has at least that alignment at runtime. If the specified type of the alloca has a higher preferred alignment, SelectionDAG currently ignores that specified alignment, and increases the alignment. It does this even if it would trigger stack realignment. I don't think this makes sense, so this patch changes that. I was looking into this for SVE in particular: for SVE, overaligning vscale'ed types is extra expensive because it requires realigning the stack multiple times, or using dynamic allocation. (This currently isn't implemented.) I updated the expected assembly for a couple tests; in particular, for arg-copy-elide.ll, the optimization in question does not increase the alignment the way SelectionDAG normally would. For the rest, I just increased the specified alignment on the allocas to match what SelectionDAG was inferring. Differential Revision: https://reviews.llvm.org/D79532	2020-05-11 17:39:00 -07:00
Davide Italiano	288c9e8178	[GlobalISel] Remove debug locations when emitting G_FCONSTANT. <rdar://problem/62991543>	2020-05-11 16:25:03 -07:00
Sanjay Patel	5f05c2f59a	[CGP] remove duplicate function for finding a splat shuffle; NFC	2020-05-11 16:36:07 -04:00
Sam McCall	728cf6d86b	Revert "[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression" This reverts commit `3c44c441db`. Causes infloops on some inputs, see https://reviews.llvm.org/D77319 for repro	2020-05-11 16:44:01 +02:00
Djordje Todorovic	45e5a32a8b	[NFC][DwarfDebug] Avoid default capturing when using lambdas It is bad practice to capture by default (via [&] in this case) when using lambdas, so we should avoid that as much as possible. This patch fixes that in the getForwardingRegsDefinedByMI from DwarfDebug module. Differential Revision: https://reviews.llvm.org/D79616	2020-05-11 10:02:13 +02:00
Djordje Todorovic	2552dc5317	[NFC][DwarfDebug] Prefer explicit to auto type deduction We should use explicit type instead of auto type deduction when the type is so obvious. In addition, we remove ambiguity, since auto type deduction sometimes is not that intuitive, so that could lead us to some unwanted behavior. This patch fixes that in the collectCallSiteParameters() from DwarfDebug module. Differential Revision: https://reviews.llvm.org/D79624	2020-05-11 09:12:58 +02:00
QingShan Zhang	3c44c441db	[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression We have the getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression. However, while negating the expression, the cost might change as we are changing the DAG, and we then hit the assertion if we negated the wrong expression because the cost is no longer trustworthy. This patch aims to remove getNegatibleCost to avoid getting out of sync with getNegatedExpression, and to check the cost while negating the expression. It also reduces the duplicated code between getNegatibleCost and getNegatedExpression, and fixes the crash for the test in D76638. Reviewed By: RKSimon, spatel Differential Revision: https://reviews.llvm.org/D77319	2020-05-11 02:41:10 +00:00
Matt Arsenault	3af85fa8f0	GlobalISel: Handle more cases in lowerUnmergeValues Handle scalar sources, as well as vectors.	2020-05-09 19:33:32 -04:00
Craig Topper	24b3c2d058	[BreakFalseDeps] Harden pickBestRegisterForUndef against changing tied operands or physical registers that aren't renamable. I don't have any test cases since X86 doesn't return any tied operands from getUndefRegClearance today. But conceivably we could want BreakFalseDeps to insert a dependency breaking XOR for a tied operand in the future.	2020-05-09 15:37:31 -07:00
Matt Arsenault	69999605ee	GlobalISel: Move code into lowering for G_MERGE_VALUES Currently this code exists in widenScalar for G_MERGE_VALUE sources. I'm not sure if the existing expansion in widenScalar should be removed or not. The widenScalar variant tries to extend to the requested size, but this just uses the original bitwidth.	2020-05-09 16:39:37 -04:00
Craig Topper	bebdc62c3f	[SelectionDAG] Remove ConstantPoolSDNode::getAlignment. Use getAlign instead. Differential Revision: https://reviews.llvm.org/D79459	2020-05-08 16:04:11 -07:00
Craig Topper	d1119980e5	[SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode. This patch stores the alignment for ConstantPoolSDNode as an Align and updates the getConstantPool interface to take a MaybeAlign. Removing getAlignment() will be done as a follow up. Differential Revision: https://reviews.llvm.org/D79436	2020-05-08 16:04:11 -07:00
Jessica Paquette	f66309deab	[GlobalISel] Don't add duplicate successors to MBBs when translating indirectbr This fixes a verifier failure on a bot: http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O0-g/ ``` *** Bad machine code: MBB has duplicate entries in its successor list. *** - function: foo - basic block: %bb.5 indirectgoto (0x7fe3d687ca08) ``` One of the GCC torture suite tests (pr70460.c) has an indirectbr instruction which has duplicate blocks in its destination list. According to the langref this is allowed: > Blocks are allowed to occur multiple times in the destination list, though > this isn’t particularly useful. (https://www.llvm.org/docs/LangRef.html#indirectbr-instruction) We don't allow this in MIR. So, when we translate such an instruction, the verifier screams. This patch makes `translateIndirectBr` check if a successor has already been added to a block. If the successor is present, it is skipped rather than added twice. Differential Revision: https://reviews.llvm.org/D79609	2020-05-08 13:40:02 -07:00
Wei Mi	aa2ddfc73d	[SampleFDO] For functions without profiles, provide an option to put them in a special text section. For sampleFDO, because the optimized build uses profile generated from previous release, previously we couldn't tell a function without profile was truly cold or just newly created so we had to treat them conservatively and put them in .text section instead of .text.unlikely. The result was that when pursuing the best performance by locking .text.hot and .text in memory, we wasted a lot of memory to keep cold functions inside. In https://reviews.llvm.org/D66374, we introduced profile symbol list to discriminate functions being cold versus functions being newly added. This mechanism works quite well for regular use cases in AutoFDO. However, in some cases, we can only have a partial profile when optimizing a target. The partial profile may be an aggregated profile collected from many targets. The profile symbol list method used for regular sampleFDO profile is not applicable to partial profile use case because it may be too large and introduce many false positives. To solve the problem for partial profile use case, we provide an option called --profile-unknown-in-special-section. For functions without profile, we will still treat them conservatively in compiler optimizations -- for example, treat them as warm instead of cold in inliner. When we use profile info to add section prefix for functions, we will discriminate functions known to be not cold versus functions without profile (being unknown), and we will put functions being unknown in a special text section called .text.unknown. Runtime system will have the flexibility to decide where to put the special section in order to achieve a balance between performance and memory saving. Differential Revision: https://reviews.llvm.org/D62540	2020-05-08 11:18:09 -07:00
Simon Pilgrim	70293ba26f	[DAG] SimplifyMultipleUseDemandedBits - remove superfluous bitcasts If the SimplifyMultipleUseDemandedBits calls BITCASTs that peek through back to the original type then we can remove the BITCASTs entirely. Differential Revision: https://reviews.llvm.org/D79572	2020-05-08 19:04:49 +01:00
Fangrui Song	befbc99a7f	Reland D79501 "[DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units." With a fix to uninitialized EndOffset. DW_OP_call_ref is the only operation that has an operand which depends on the DWARF format. The patch fixes handling that operation in DWARF64 units. Differential Revision: https://reviews.llvm.org/D79501	2020-05-08 09:35:54 -07:00
Krasimir Georgiev	c5e0967e4c	Revert "[DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units." This reverts commit `989ae9e848`. Newly added test fails: FAIL: LLVM::DW_OP_call_ref_unexpected.s http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/28298	2020-05-08 17:24:32 +02:00
Simon Pilgrim	9f726376e3	LiveIntervalCalc - remove unnecessary includes. NFC. As we're inheriting from LiveRangeCalc, all the headers are already explicitly required by LiveRangeCalc.h	2020-05-08 14:57:35 +01:00
Igor Kudrin	989ae9e848	[DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units. DW_OP_call_ref is the only operation that has an operand which depends on the DWARF format. The patch fixes handling that operation in DWARF64 units. Differential Revision: https://reviews.llvm.org/D79501	2020-05-08 15:14:42 +07:00
aartbik	771d30c647	[llvm] [CodeGen] Fixed vector halving bug for masked store Summary: Note that this fix is very similar to what has already been done for the masked load in https://reviews.llvm.org/D78608 Bugs: https://bugs.llvm.org/show_bug.cgi?id=45563 https://bugs.llvm.org/show_bug.cgi?id=45833 Reviewers: craig.topper, nicolasvasilache, mehdi_amini Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79611	2020-05-07 19:01:40 -07:00
James Y Knight	7af9d386da	Correctly modify the CFG in IfConverter, and then remove the CorrectExtraCFGEdges function. The latter was a workaround for "Various pieces of code" leaving bogus extra CFG edges in place. Where by "various" it meant only IfConverter::MergeBlocks, which failed to clear all of the successors of dead blocks it emptied out. This wouldn't matter a whole lot, except that the dead blocks remained listed as predecessors of still-useful blocks, inhibiting optimizations. This fix slightly changed two thumb tests, because the correct CFG successors allowed for the "diamond" if-conversion pattern to be detected, when it could only use "simple" before. Additionally, the removal of a now-redundant call to analyzeBranch (with AllowModify=true) in BranchFolder::OptimizeFunction caused a later check for an empty block in BranchFolder::OptimizeBlock to fail. Correct this by moving the call to analyzeBranch in OptimizeBlock higher. Differential Revision: https://reviews.llvm.org/D79527	2020-05-07 18:17:07 -04:00
Hiroshi Yamauchi	1b4e3def03	[BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare. Summary: This helps detect some missed BFI updates during CodeGenPrepare. This is debug build only and disabled behind a flag. Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts(). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77417	2020-05-07 11:58:00 -07:00
Thomas Raoux	dc26dec331	[ModuloSchedule] Fix epilogue peeling with illegal phi. When peeling out the epilogue we need to ignore illegal phis coming from stages greater than the producer stage. Otherwise we end up with circular phi dependencies. Differential Revision: https://reviews.llvm.org/D79581	2020-05-07 10:04:05 -07:00
Kerry McLaughlin	a31f4c52bf	[SVE][CodeGen] Fix legalisation for scalable types Summary: This patch handles illegal scalable types when lowering IR operations, addressing several places where the value of isScalableVector() is ignored. For types such as <vscale x 8 x i32>, this means splitting the operations. In this example, we would split it into two operations of type <vscale x 4 x i32> for the low and high halves. In cases such as <vscale x 2 x i32>, the elements in the vector will be promoted. In this case they will be promoted to i64 (with a vector of type <vscale x 2 x i64>) Reviewers: sdesmalen, efriedma, huntergr Reviewed By: efriedma Subscribers: david-arm, tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78812	2020-05-07 10:01:31 +01:00
Craig Topper	7b9d6673bf	[SelectionDAG] When splitting gather operands in type legalization, set MMO size to UnknownSize I missed this case when I did the same for gather results and scatter operands in `c69a4d6bef`.	2020-05-06 19:57:14 -07:00
Alexandre Ganea	f78b674de4	Revert "[Debug][CodeView] Emit fully qualified names for globals" This reverts commit `06591b6d19`.	2020-05-06 15:23:58 -04:00
LemonBoy	7fa5abd343	[SelectionDAG] Fix assertion failure with big shift amounts Calling getShiftAmountTy with LegalTypes set may return a type that's too narrow to hold the shift amount for integer type it's applied to. Fixes the regression introduced by D79096 Differential Revision: https://reviews.llvm.org/D79405	2020-05-06 11:58:37 -07:00
Michael Liao	6533c1da7f	Revert "[MIR] Fix a bug in MIR printer." This reverts commit `e38018b80d`.	2020-05-06 11:26:42 -04:00
Michael Liao	e38018b80d	[MIR] Fix a bug in MIR printer. - Need to skip the assignment of `ID`, which is used to index that two object arrays.	2020-05-06 10:33:45 -04:00
Sanjay Patel	2f1fe1864d	[DAGCombiner] sink target-supported FP<->int cast op after concat vectors Try to combine N short vector cast ops into 1 wide vector cast op: concat (cast X), (cast Y)... -> cast (concat X, Y...) This is part of solving PR45794: https://bugs.llvm.org/show_bug.cgi?id=45794 As noted in the code comment, this is uglier than I was hoping because the opcode determines whether we pass the source or destination type to isOperationLegalOrCustom(). Also IIUC, there's no way to validate what the other (dest or src) type is. Without the extra legality check on that, there's an ARM regression test in: test/CodeGen/ARM/isel-v8i32-crash.ll ...that will crash trying to lower an unsupported v8f32 to v8i16. Differential Revision: https://reviews.llvm.org/D79360	2020-05-06 10:25:58 -04:00
Alexandre Ganea	06591b6d19	[Debug][CodeView] Emit fully qualified names for globals Emit S_[L|G][THREAD32|DATA32] records with a fully qualified name (namespace + class scope). Differential Revision: https://reviews.llvm.org/D79447	2020-05-06 09:12:00 -04:00
David Spickett	055ea585c7	Reland "[CodeGen] Make logic of CCState::resultsCompatible clearer" This relands commit `d782d1f898`. With a typo fixed, which was causing the x86 test failure.	2020-05-06 13:40:49 +01:00
David Spickett	e1022cb5d4	Revert "[CodeGen] Make logic of CCState::resultsCompatible clearer" This reverts commit `d782d1f898` which caused test CodeGen/X86/sibcall.ll to fail.	2020-05-06 10:14:17 +01:00
David Spickett	d782d1f898	[CodeGen] Make logic of CCState::resultsCompatible clearer	2020-05-06 09:48:58 +01:00
Konstantin Schwarz	e82b0e9a8e	[GlobalISel][InlineAsm] Add support for basic output operand constraints Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette Reviewed By: arsenm Subscribers: gargaroff, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78318	2020-05-06 10:06:13 +02:00
Puyan Lotfi	0c4aab27b3	[NFC] Outliner label name clean up. Just simplifying how the label name is generated while using std::to_string instead of Twine. Differential Revision: https://reviews.llvm.org/D79464	2020-05-05 23:27:46 -04:00
Jinsong Ji	80b78a47e5	[MachinePipeliner] Add ORE for MachinePipeliner This patch adds ORE for MachinePipeliner, so that people can analyze their code using opt-viewer or other tools, then optimize the code to catch more pipelining opportunities. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D79368	2020-05-05 16:04:53 +00:00
Sam Parker	40574fefe9	[NFC][CostModel] Add TargetCostKind to relevant APIs Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html Differential Revision: https://reviews.llvm.org/D79002	2020-05-05 10:35:54 +01:00
David Sherwood	cd3a54c55a	[CodeGen] Fix warnings due to SelectionDAG::getSplatSourceVector Summary: I have fixed several places in getSplatSourceVector and isSplatValue to work correctly with scalable vectors. I added new support for the ISD::SPLAT_VECTOR DAG node as one of the obvious cases we can support with scalable vectors. In other places I have tried to do the sensible thing, such as bail out for vector types we don't yet support or don't intend to support. It's not possible to add IR test cases to cover these changes, since they are currently only ever exercised on certain targets, e.g. only X86 targets use the result of getSplatSourceVector. I've assumed that X86 tests already exist to test these code paths for fixed vectors. However, I have added some AArch64 unit tests that test the specific functions I have changed. Differential revision: https://reviews.llvm.org/D79083	2020-05-05 08:45:41 +01:00
Krzysztof Parzyszek	156092bbcc	[RegisterCoalescer] Extend a subrange if needed when filling range gap Register live ranges may have had gaps that after coalescing should be removed. This is done by adding a new segment to the range, and merging it with neighboring segments. When doing so, do not assume that each subrange of the register ended at the same index. If a subrange ended earlier, adding this segment could make the live range invalid. Instead, if the subrange is not live at the start of the segment, extend it first.	2020-05-04 16:49:59 -05:00
Snehasish Kumar	c8ac29ab1d	Descriptive symbol names for machine basic block sections. Today symbol names generated for machine basic block sections use a unary encoding to reduce bloat. This is essential when every basic block in the binary is assigned a symbol however with basic block clusters (rG05192e585ce175b55f2a26b83b4ed7882785c8e6) when we only need to generate a few non-temporary symbols we can assign more descriptive names making them more user friendly. With this change - Cold cluster section for function foo is named "foo.cold" Exception cluster section for function foo is named "foo.eh" Other cluster sections identified by their ids are named "foo.ID" Using this format works well with existing tools. It will demangle as expected and works with existing symbolizers, profilers and debuggers out of the box. $ c++filt _Z3foov.cold foo() [clone .cold] $ c++filt _Z3foov.eh foo() [clone .eh] $ c++filt _Z3foov.1234 foo() [clone 1234] Tests for basicblock-sections are updated with some cleanup where appropriate. Differential Revision: https://reviews.llvm.org/D79221	2020-05-04 19:06:43 +00:00
Alexandre Ganea	721ea5b380	[DebugInfo][CodeView] Include namespace into emitted globals Before this patch, global variables didn't have their namespace prepended in the Codeview debug symbol stream. This prevented Visual Studio from displaying them in the debugger (they appeared as 'unspecified error') Differential Revision: https://reviews.llvm.org/D79028	2020-05-04 13:59:36 -04:00
Alex Richardson	d1ff003fbb	[SelectionDAGBuilder] Stop setting alignment to one for hidden sret values We allocated a suitably aligned frame index so we know that all the values have ABI alignment. For MIPS this avoids using pair of lwl + lwr instructions instead of a single lw. I found this when compiling CHERI pure capability code where we can't use the lwl/lwr unaligned loads/stores and were falling back to a byte load + shift + or sequence. This should save a few instructions for MIPS and possibly other backends that don't have fast unaligned loads/stores. It also improves code generation for CodeGen/X86/pr34653.ll and CodeGen/WebAssembly/offset.ll since they can now use aligned loads. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78999	2020-05-04 14:44:39 +01:00
Ten Tzen	21c1a0c730	Test Commit: add two head comments in WinEHPrepare.cpp This is a Test commit.	2020-05-03 01:15:59 -07:00
LemonBoy	6d103ca855	[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces. The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines. Differential Revision: https://reviews.llvm.org/D79096	2020-05-02 15:18:10 -07:00
Simon Pilgrim	a09a3c6d3e	Revert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call" Causing buildbot failures	2020-05-02 20:08:33 +01:00
Simon Pilgrim	8e05ac0a51	[DAGCombine] visitTRUNCATE - remove GetDemandedBits call rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).	2020-05-02 19:52:17 +01:00
Benjamin Kramer	97f92261df	[MBP] tuple->pair. NFC. std::pair has a trivial copy ctor, std::tuple doesn't.	2020-05-02 20:23:34 +02:00
Sam McCall	d10c995b4d	std::isspace -> llvm::isSpace (where locale should be ignored) I've left out some cases where I wasn't totally sure this was right or whether the include was ok (compiler-rt) or idiomatic (flang).	2020-05-02 15:36:04 +02:00
Simon Pilgrim	7cb5a51f38	[DAG] SimplifyDemandedVectorElts - add INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling	2020-05-01 16:20:51 +01:00
Simon Pilgrim	65d32a9892	[DAG] SimplifyDemandedVectorElts - remove INSERT_SUBVECTOR if we don't demand the subvector	2020-05-01 16:20:51 +01:00
Simon Pilgrim	e3c0be596c	[DAG] SimplifyDemandedVectorElts - add EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling	2020-05-01 13:48:07 +01:00
Craig Topper	6a1ad76dab	[X86] Don't return true from isTruncateFree for vectors Also fix some cost tables for vXi1 types to match the costs entries for the types they will be promoted to. Differential Revision: https://reviews.llvm.org/D79045	2020-04-30 16:43:35 -07:00
Benjamin Kramer	31db4dbbbe	Clean up warnings after `a2c8cd1812`	2020-04-30 17:01:30 +02:00
diggerlin	a2c8cd1812	[AIX] emit .extern and .weak directive linkage SUMMARY: emit .extern and .weak directive linkage Reviewers: hubert.reinterpretcast, Jason Liu Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D76932	2020-04-30 09:54:10 -04:00
Simon Pilgrim	96238486ed	[DAGCombine] Move the remaining X86 funnel shift patterns to DAGCombine X86 matches several 'shift+xor' funnel shift patterns: fold (or (srl (srl x1, 1), (xor y, 31)), (shl x0, y)) -> (fshl x0, x1, y) fold (or (shl (shl x0, 1), (xor y, 31)), (srl x1, y)) -> (fshr x0, x1, y) fold (or (shl (add x0, x0), (xor y, 31)), (srl x1, y)) -> (fshr x0, x1, y) These patterns are also what we end up with the proposed expansion changes in D77301. This patch moves these to DAGCombine's generic MatchFunnelPosNeg. All existing X86 test cases still pass, and we just have a small codegen change in pr32282.ll. Reviewed By: @spatel Differential Revision: https://reviews.llvm.org/D78935	2020-04-30 12:57:17 +01:00
Simon Pilgrim	6547a5ceb2	[DAG] Add TODO comment regarding ADD(X,X) -> SHL(X,1) canonicalization As discussed on D78935	2020-04-30 12:57:16 +01:00
David Sherwood	058cd8c5be	[CodeGen] Add support for inserting elements into scalable vectors Summary: This patch tries to ensure that we do something sensible when generating code for the ISD::INSERT_VECTOR_ELT DAG node when operating on scalable vectors. Previously we always returned 'undef' when inserting an element into an out-of-bounds lane index, whereas now we only do this for fixed length vectors. For scalable vectors it is assumed that the backend will do the right thing in the same way that we have to deal with variable lane indices. In this patch I have permitted a few basic combinations for scalable vector types where it makes sense, but in general avoided most cases for now as they currently require the use of BUILD_VECTOR nodes. This patch includes tests for all scalable vector types when inserting into lane 0, but I've only included one or two vector types for other cases such as variable lane inserts. Differential Revision: https://reviews.llvm.org/D78992	2020-04-30 11:14:04 +01:00
Puyan Lotfi	ffd5e121d7	[NFCi] Iterative Outliner + clang-format refactoring. Prior to D69446 I had done some NFC cleanup to make landing an iterative outliner a cleaner more straight-forward patch. Since then, it seems that has landed but I noticed some ways it could be cleaned up. Specifically: 1) doOutline was meant to be the re-runable function, but instead runOnceOnModule was created that just calls doOutline. 2) In D69446 we discussed that the flag allowing the re-run of the outliner should be a flag to tell how many additional times to run the outliner again, not the total number of times. I don't think it makes sense to introduce a flag, but print an error if the flag is set to 0. This is an NFCi, the i being that I get rid of the way that the machine-outline-runs flag could be used to tell the outliner to not run at all, and because I renamed the flag to '-machine-outliner-reruns'. Differential Revision: https://reviews.llvm.org/D79070	2020-04-29 18:36:47 -04:00
Davide Italiano	dcdb1b94e1	[MachineVerifier] Remove an unused function. NFCI.	2020-04-29 09:58:27 -07:00
Simon Pilgrim	1be7f2de1b	Revert rG5c4b4a62256876 "PseudoSourceValue.h - reduce GlobalValue.h include to forward declaration. NFC." Causes buildbot failures.	2020-04-29 16:12:19 +01:00
Simon Pilgrim	5c4b4a6225	PseudoSourceValue.h - reduce GlobalValue.h include to forward declaration. NFC. Fix MachineMemOperand.h implicit dependency on Type.h via PseudoSourceValue.h	2020-04-29 15:39:27 +01:00
QingShan Zhang	b5f89744cc	[DAGCombine] Checking the cost directly to improve the code readability Call getNegatedExpression(Cost) and check the Cost to make the code more clear. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D78347	2020-04-29 01:49:39 +00:00
Casey Carter	68b30bc02b	[NFC] Correct spelling of "ambiguous"	2020-04-28 14:51:37 -07:00
Krzysztof Parzyszek	25a4b1904c	Handle part-word LL/SC in atomic expansion pass Differential Revision: https://reviews.llvm.org/D77213	2020-04-28 10:07:39 -05:00
Sam Parker	e9c9329aa4	[TTI] Add TargetCostKind argument to getUserCost There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635	2020-04-28 08:57:45 +01:00
Craig Topper	e13c141a91	[SelectionDAGBuilder] Use CallBase::isInlineAsm in a couple places. NFC These lines were just changed from using CallBase::getCalledValue to getCalledOperand. Go ahead and change them to isInlineAsm.	2020-04-27 23:00:44 -07:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, this removes uses of getElementType on a pointer where we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
LemonBoy	f30416fdde	[AsmPrinter] Fix emission of non-standard integer constants for BE targets The code assumed that zero-extending the integer constant to the designated alloc size would be fine even for BE targets, but that's not the case as that pulls in zeros from the MSB side while we actually expect the padding zeros to go after the LSB. I've changed the codepath handling the constant integers to use the store size for both small(er than u64) and big constants and then add zero padding right after that. Differential Revision: https://reviews.llvm.org/D78011	2020-04-27 14:57:29 -07:00
Nick Desaulniers	59acdf0aca	fix D78849 for g++ < 7.1 Summary: Looks like g++ < 7.1 has a bug resolving calls to member functions without `this->` in lambdas with `auto` types. It looks like multiple build bots are using g++-5. https://stackoverflow.com/questions/32097759/calling-this-member-function-from-generic-lambda-clang-vs-gcc https://godbolt.org/z/MiaRt- Reviewers: MaskRay, efriedma, jyknight, craig.topper, rsmith Reviewed By: rsmith Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D78962	2020-04-27 13:47:00 -07:00
Wei Mi	68d2301e12	Recommit "Generate Callee Saved Register (CSR) related cfi directives like .cfi_restore" Insert .cfi_offset/.cfi_register when IncomingCSRSaved of current block is larger than OutgoingCSRSaved of its previous block. Original commit message: https://reviews.llvm.org/D42848 only handled CFA related cfi directives but didn't handle CSR related cfi. The patch adds the CSR part. Basically it reuses the framework created in D42848. For each basicblock, the patch tracks which CSR set has been saved at its CFG predecessors' exits, and compares the CSR set with the set at its previous basicblock's exit (the previous block is the block laid out before the current block). If the saved CSR set at its previous basicblock's exit is larger, .cfi_restore will be inserted. The patch also generates proper .cfi_restore in epilogue to make sure the saved CSR set is consistent for the incoming edges of each block. Differential Revision: https://reviews.llvm.org/D74303	2020-04-27 12:46:58 -07:00
Nick Desaulniers	c695ea2afa	[MachineVerifier] retrofit iterators with range for. NFC Summary: Reviewing failures identified in D78586, I was finding the identifiers for these iterators hard to read. Reviewers: efriedma, MaskRay, jyknight Reviewed By: MaskRay Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D78849	2020-04-27 12:15:55 -07:00
Davide Italiano	c8433a5b1b	[GlobalISel] Remove debug locations when emitting constants. The tl;dr story is that this causes jumps in the emitted line tables, even at `-O0`. We could at some point consider more fancy solutions to preserve locations, but it doesn't seem to be worth the effort for now. <rdar://problem/62460788> Differential Revision: https://reviews.llvm.org/D78947	2020-04-27 11:27:08 -07:00
David Sherwood	096b25a8d8	[CodeGen] Use SPLAT_VECTOR for zeroinitialiser with scalable types Summary: When generating code for the LLVM IR zeroinitialiser operation, if the vector type is scalable we should be using SPLAT_VECTOR instead of BUILD_VECTOR. Differential Revision: https://reviews.llvm.org/D78636	2020-04-27 15:57:59 +01:00
QingShan Zhang	2957fa0cd1	[NFC][DAGCombine] Adding three helper functions and change the getNegatedExpression to negateExpression This is an NFC patch for D77319. The idea is to hide the getNegatibleCost inside the getNegatedExpression() to have it return null if the cost is expensive, and add some helper functions for ease of use. And rename the old getNegatedExpression to negateExpression to avoid the semantic conflict. Reviewed By: RKSimon Differential revision: https://reviews.llvm.org/D78291	2020-04-27 04:11:42 +00:00
Simon Pilgrim	a3982491db	[Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815	2020-04-26 12:58:20 +01:00
Benjamin Kramer	1d42764df7	Give helpers internal linkage. NFC.	2020-04-25 11:50:52 +02:00
Snehasish Kumar	0cc063a8ff	Use .text.unlikely and .text.eh prefixes for MachineBasicBlock sections. Summary: Instead of adding a ".unlikely" or ".eh" suffix for machine basic blocks, this change updates the behaviour to use an appropriate prefix instead. This allows lld to group basic block sections together when -z,keep-text-section-prefix is specified and matches the behaviour observed in gcc. Reviewers: tmsriram, mtrofin, efriedma Reviewed By: tmsriram, efriedma Subscribers: eli.friedman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78742	2020-04-24 15:07:38 -07:00
Fangrui Song	10bc12588d	[XRay] Change Sled.Function to PC-relative for sled version 2 and make llvm-xray support sled version 2 addresses Follow-up of D78082 and D78590. Otherwise, because xray_instr_map is now read-only, the absolute relocation used for Sled.Function will cause a text relocation.	2020-04-24 14:41:56 -07:00
Amara Emerson	dbb0356771	[AArch64][GlobalISel] Fix sub-64b stack parameter passing on Darwin. A previous bug fix for varargs introduced a regression where we would incorrectly widen some stores to memory when passing i8/i16 parameters on the stack. This didn't show up seemingly because it only happens when there is no signext/zeroext parameter attribute, which I think for Darwin clang adds. Swift however seems to be a different story, and a plain anyext on the parameter triggered the bug. To fix this, I've added a new ValueHandler::assignValueToAddress type override which lets us distinguish between varargs and fixed args (we still need this widening behaviour for varargs to fix the original bug in 2018). rdar://61353552	2020-04-24 13:56:43 -07:00
Jean-Michel Gorius	505685a67a	[llvm][CodeGen] Check for memory instructions when querying for alias status Summary: Add a check to make sure that MachineInstr::mayAlias returns prematurely if at least one of its instruction parameters does not access memory. This prevents calls to TargetInstrInfo::areMemAccessesTriviallyDisjoint with incompatible instructions. A side effect of this change is to render the mayAlias helper in the AArch64 load/store optimizer obsolete. We can now directly call the MachineInstr::mayAlias member function. Reviewers: hfinkel, t.p.northover, mcrosier, eli.friedman, efriedma Reviewed By: efriedma Subscribers: efriedma, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78823	2020-04-24 22:54:46 +02:00
Simon Pilgrim	628b0243c8	AllocationOrder.h - split MCRegisterInfo.h include. NFC. We only require to include MCRegister.h and SmallVector.h.	2020-04-24 18:42:43 +01:00
Fangrui Song	25e22613df	[XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address) Follow-up of D78082 (x86-64). This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le. MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined. Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well. Tested on AArch64 Linux and ppc64le Linux. Reviewed By: ianlevesque Differential Revision: https://reviews.llvm.org/D78590	2020-04-24 08:35:43 -07:00
Simon Pilgrim	f10835a034	DwarfDebug.h - remove unnecessary forward declarations. NFC. We include their headers already.	2020-04-24 15:34:54 +01:00
aartbik	907871d9ad	[llvm] [CodeGen] Fixed vector halving bug for masked load Summary: Given a VL=14 that is enveloped by a proper VL=16, splitting the masked load using the enveloping halving VL=8/8 should eventually yield V=8/5. This fixes various assert failures in getHalfNumVectorElementsVT() and IncrementMemoryAddress(). Note, I suspect similar fixes will be needed for other masked operations, but for now I send out a fix for masked load only. Bugzilla issue 45563 https://bugs.llvm.org/show_bug.cgi?id=45563 Reviewers: craig.topper, mehdi_amini, nicolasvasilache Reviewed By: craig.topper Subscribers: hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78608	2020-04-23 15:12:44 -07:00
Christopher Tetreault	ccd623eae3	[SVE] Remove calls to isScalable from CodeGen Reviewers: efriedma, sdesmalen, stoklund, sunfish Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77755	2020-04-23 12:58:52 -07:00
Alex Richardson	bbcfce4bad	Use FrameIndexTy for stack protector Using getValueType() is not correct for architectures extended with CHERI since we need a pointer type and not the value that is loaded. While stack protector is useless when you have CHERI (since CHERI provides much stronger security guarantees), we still have a test to check that we can generate correct code for checks. Merging `b281138a1b` into our tree broke this test. Fix by using TLI.getFrameIndexTy(). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D77785	2020-04-23 13:12:27 +01:00
Amara Emerson	613f12dd8e	[AArch64][GlobalISel] Set the current debug loc when missing in some cases.	2020-04-23 01:34:57 -07:00
Aditya Nandakumar	3db893b371	[GISel]: Relax opcode checking at the top level to enable CSE Loosen the restriction on what kinds of opcodes can be CSEd as targets may want to CSE some generic target specific pseudos. NFC as far as this change is concerned as CSEConfig still pretty much is a subset of this check. Differential Revision: https://reviews.llvm.org/D78684	2020-04-22 17:31:33 -07:00
Vedant Kumar	f0b52beef3	[AArch64InstrInfo] Ignore debug insts in areCFlagsAccessedBetweenInstrs [7/14] Summary: Fix an issue where the presence of debug info could disable a peephole optimization due to areCFlagsAccessedBetweenInstrs returning the wrong result. In test/CodeGen/AArch64/arm64-csel.ll, the issue was found in the function @foo5, in which the first compare could successfully be optimized but not the second. Reviewers: t.p.northover, eastig, paquette Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, dsanders, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78157	2020-04-22 17:03:40 -07:00
Vedant Kumar	26271c8384	[AArch64InstrInfo] Ignore debug insts in canInstrSubstituteCmpInstr [6/14] Summary: Fix an issue where the presence of debug info could disable a peephole optimization in optimizeCompareInstr due to canInstrSubstituteCmpInstr returning the wrong result. Depends on D78137. Reviewers: t.p.northover, eastig, paquette Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, llvm-commits, dsanders Tags: #llvm Differential Revision: https://reviews.llvm.org/D78151	2020-04-22 17:03:40 -07:00
Vedant Kumar	f1a71b5949	[GIsel][LegalizerHelper] Account for debug insts when creating mem libcalls [5/14] Summary: While lowering memory intrinsics, GIsel attempts to form a tail call to a library routine. There might be a DBG_LABEL or something after the intrinsic call, though: in that case, GIsel should still be able to form the tail call, and should also delete the debug insts after the tail call as the transform makes them invalid. Reviewers: dsanders, aemerson Subscribers: hiraditya, aprantl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78335	2020-04-22 17:03:40 -07:00
Vedant Kumar	ba9db54505	[GIsel][CombinerHelper] Fix for missed ElideBrByInvertingCond/CombineIndexedLoadStore combines [4/14] Summary: Fix an issue which could result in ElideBrByInvertingCond or CombineIndexedLoadStore being missed when debug info is present. In both cases the fix is s/hasOneUse/hasOneNonDbgUse/. Reviewers: aemerson, dsanders Subscribers: hiraditya, aprantl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78254	2020-04-22 17:03:40 -07:00
Vedant Kumar	5c04274dab	[GIsel][CombinerHelper] Don't consider debug insts in dominance queries [3/14] Summary: This fixes several issues where the presence of debug instructions could disable certain combines, due to dominance queries finding uses/defs that don't actually exist. Reviewers: dsanders, fhahn, paquette, aemerson Subscribers: hiraditya, arphaman, aprantl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78253	2020-04-22 17:03:40 -07:00
Vedant Kumar	5bae277584	[GISel][RegBankSelect] Hide assertion failure from LLT::getScalarSizeInBits [2/14] Summary: It looks like RegBankSelect can try to assign a bank based on a DBG_VALUE instead of ignoring it. This eventually leads to an assert in AArch64RegisterBankInfo::getInstrMapping because there is some info missing from the DBG_VALUE MachineOperand (I see: `Assertion failed: (RawData != 0 && "Invalid Type"), function getScalarSizeInBits`). I'm not 100% sure it's safe to insert DBG_VALUE instructions right before RegBankSelect (that's what -debugify-and-strip-all-safe is doing). Any advice appreciated. Depends on D78135. Reviewers: ab, qcolombet, dsanders, aprantl Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78137	2020-04-22 17:03:39 -07:00
Vedant Kumar	10ce1bc8d0	[MachineBasicBlock] Add helpers for skipping debug instructions [1/14] Summary: These helpers are exercised by follow-up commits in this patch series, which is all about removing CodeGen differences with vs. without debug info in the AArch64 backend. Reviewers: fhahn, aprantl, jpaquette, paquette Subscribers: kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78260	2020-04-22 17:03:39 -07:00
Vedant Kumar	2a5675f11d	[MachineDebugify] Insert synthetic DBG_VALUE instructions Summary: Teach MachineDebugify how to insert DBG_VALUE instructions. This can help find bugs causing CodeGen differences when debug info is present. DBG_VALUE instructions are only emitted when -debugify-level is set to locations+variables. There is essentially no attempt made to match up DBG_VALUE register operands with the local variables they ought to correspond to. I'm not sure how to improve the situation. In some cases (MachineMemOperand?) it's possible to find the IR instruction a MachineInstr corresponds to, but in general this seems to call for "undoing" the work done by ISel. Reviewers: dsanders, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78135	2020-04-22 17:03:39 -07:00
Mark Lacey	328bb446dd	Add a policy to enable computing SchedDFSResult. Summary: Make GenericScheduler compute SchedDFSResult on initialization if the policy is set. This makes it possible to create classes that extend GenericScheduler and rely on the results of SchedDFSResult, e.g. to perform subtree scheduling. NFC unless the policy is set. Subscribers: MatzeB, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78432	2020-04-22 16:36:11 -07:00
Eli Friedman	1a78b0bd38	[MachineOutliner] Teach outliner to set live-ins Preserving liveness can be useful even late in the pipeline, if we're doing substantial optimization work afterwards. (See, for example, D76065.) Teach MachineOutliner how to correctly set live-ins on the basic block in outlined functions. Differential Revision: https://reviews.llvm.org/D78605	2020-04-22 14:19:26 -07:00
Puyan Lotfi	264c07ef77	[llvm][MIRVRegNamer] Avoid collisions across jump table indices. Hash Jump Table Indices uniquely within a basic block for MIR Canonicalizer / MIR VReg Renamer passes. Differential Revision: https://reviews.llvm.org/D77966	2020-04-22 14:58:44 -04:00
Christopher Tetreault	2dea3f1298	[SVE] Add new VectorType subclasses Summary: Introduce new types for fixed width and scalable vectors. Does not remove getNumElements yet so as to not break code during transition period. Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr Reviewed By: sdesmalen Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm, #lldb Differential Revision: https://reviews.llvm.org/D77587	2020-04-22 08:59:01 -07:00
Simon Pilgrim	fc044530f7	BranchFolding.h - remove unused raw_ostream forward declaration. NFC.	2020-04-22 15:07:18 +01:00
Simon Pilgrim	c3730ad8fc	[AsmPrinter] Remove unused forward declarations. NFC.	2020-04-22 14:01:52 +01:00
Craig Topper	05a11974ae	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-22 00:07:13 -07:00
Eli Friedman	46a52ff9ed	[TargetPassConfig] Run MachineVerifier after more passes. We were disabling verification for no reason in a bunch of places; just turn it on. At this point, there are two key places where we don't run verification: during register allocation, and after addPreEmitPass. Regalloc probably isn't worth messing with; it has its own invariants, and verifying afterwards is probably good enough. For after addPreEmitPass, it's probably worth investigating improvements.	2020-04-21 21:05:07 -07:00
Fangrui Song	c5d38924dc	[XRay] xray_fn_idx: set SHF_WRITE to avoid text relocations In a future change we should properly fix xray_fn_idx to use PC-relative addresses as well, but for now let's keep absolute addresses until sled addresses are all fixed.	2020-04-21 12:02:29 -07:00
Ana Pazos	66590e1e9e	[MC][PGO][PGSO] Cleanup unused MBFI in AsmPrinter Summary: Machine Block Frequency Info (MBFI) is being computed but unused in AsmPrinter. MBFI computation was introduced with PGO change D71149 and then its use was removed in D71106. No need to keep computing it. Reviewers: MaskRay, jyknight, skan, yamauchi, davidxl, efriedma, huihuiz Reviewed By: MaskRay, skan, yamauchi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78526	2020-04-21 10:01:56 -07:00
Fangrui Song	5771c98562	[XRay] Change xray_instr_map sled addresses from absolute to PC relative for x86-64 xray_instr_map contains absolute addresses of sleds, which are relocated by `R_*_RELATIVE` when linked in -pie or -shared mode. By making these addresses relative to PC, we can avoid the dynamic relocations and remove the SHF_WRITE flag from xray_instr_map. We can thus save VM pages containing xray_instr_map (because they are not modified). This patch changes x86-64 and bumps the sled version to 2. Subsequent changes will change powerpc64le and AArch64. Reviewed By: dberris, ianlevesque Differential Revision: https://reviews.llvm.org/D78082	2020-04-21 09:36:09 -07:00
Nick Desaulniers	d3fdafae06	[InlineSpiller] simplify insertReload() NFC Summary: The repeated use of std::next() on a MachineBasicBlock::iterator was clever, but we only need to reconstruct the iterator post creation of the spill instruction. This helps simplifying where we plan to place the spill, as discussed in D77849. From here, we can simplify the code a little by flipping the return code of a helper. Reviewers: efriedma Reviewed By: efriedma Subscribers: qcolombet, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D78520	2020-04-21 08:31:20 -07:00
Fraser Cormack	c3a292961d	Let targets adjust physical output- and anti-deps Differential Revision: https://reviews.llvm.org/D78380	2020-04-21 13:45:03 +01:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Shengchen Kan	c031378ce0	[MC][NFC] Use camelCase style for functions in MCObjectStreamer	2020-04-20 20:09:20 -07:00
Andrew Litteken	1488bef8fc	[MachineOutliner] Annotation for outlined functions in AArch64 - Adding changes to support comments on outlined functions with outlining for the conditions through which it was outlined (e.g. Thunks, Tail calls) - Adapts the emitFunctionHeader to print out a comment next to the header if the target specifies it based on information in MachineFunctionInfo - Adds mir test for function annotation Differential Revision: https://reviews.llvm.org/D78062	2020-04-20 13:33:31 -07:00
Craig Topper	fcc9d70260	Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign." This is breaking the clang build. This reverts commit `897409fb56`.	2020-04-20 13:25:06 -07:00
Craig Topper	897409fb56	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 13:08:05 -07:00
Simon Pilgrim	6cb204eb64	BranchFolding.h - cleanup includes and forward declarations. NFC. Push MBFIWrapper.h include down to BranchFolding.cpp/IfConversion.cpp	2020-04-20 15:59:39 +01:00
Simon Pilgrim	9036fcd25f	MIRVRegNamerUtils.h - remove unnecessary includes. NFC. Replace with forward declarations or push down to MIRVRegNamerUtils.cpp where necessary.	2020-04-20 15:59:39 +01:00
Konstantin Schwarz	12030494fc	[GlobalISel] Introduce InlineAsmLowering class Summary: Similar to the CallLowering class used for lowering LLVM IR calls to MIR calls, we introduce a separate class for lowering LLVM IR inline asm to MIR INLINEASM. There is no functional change yet, all existing tests should pass. Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette Reviewed By: aemerson Subscribers: gargaroff, wdng, mgorny, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78316	2020-04-20 15:10:18 +02:00
Kang Zhang	a8e15ee04a	[CodeGen] Support freeze expand for ppc_fp128 Summary: The patch D29014 added the new ISD::FREEZE and can deal with integers. The patch D76980 added SoftenFloatRes_FREEZE for floating point. But we still lack an expand for ppc_fp128, which causes an assertion in some cases. This patch adds support for freeze expand for ppc_fp128. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78278	2020-04-20 07:27:41 +00:00
Simon Pilgrim	46de0d5fe9	SelectionDAGBuilder.h - remove unused includes + forward declarations. NFC. Replace SelectionDAG.h include with SelectionDAG forward declaration.	2020-04-19 12:38:41 +01:00
Simon Pilgrim	032738d17e	InstrEmitter.h - reduce SelectionDAG.h include to SelectionDAGNodes.h include. Add SDDbgLabel/TargetLowering forward declarations. Add the full SelectionDAG.h include to InstrEmitter.cpp.	2020-04-19 11:52:31 +01:00
LemonBoy	aad3d578da	[DebugInfo] Change DIEnumerator payload type from int64_t to APInt This allows the representation of arbitrarily large enumeration values. See https://lists.llvm.org/pipermail/llvm-dev/2017-December/119475.html for context. Reviewed By: andrewrk, aprantl, MaskRay Differential Revision: https://reviews.llvm.org/D62475	2020-04-18 12:49:31 -07:00
Simon Pilgrim	2333ea1e70	[cmake] LLVMMIRParser - add include/llvm/CodeGen/LLVMMIRParser header path Pick up the CodeGen/MIRParser headers in MSVC projects	2020-04-18 12:31:41 +01:00
Simon Pilgrim	5c16da387e	[cmake] LLVMGlobalISel - add include/llvm/CodeGen/GlobalISel header path Pick up the GlobalISel headers in MSVC projects	2020-04-18 12:31:40 +01:00
Andrew Litteken	8d5024f7fe	fix to outline cfi instruction when can be grouped in a tail call [MachineOutliner] fix test for excluding CFI and add test to include CFI in outlining New test to check that we only outline CFI instruction if all CFI Instructions in the function would be captured by the outlining adding x86 tests analogous to AARCH64 cfi tests Revision: https://reviews.llvm.org/D77852	2020-04-17 22:26:34 -07:00
Daniel Sanders	14ad8dc076	Don't accidentally create MachineFunctions in mir-debugify/mir-strip-debugify We should only modify existing ones. Previously, we were creating MachineFunctions for externally-available functions. AFAICT this was benign in tree but ultimately led to asan bugs in our out of tree target.	2020-04-17 14:28:41 -07:00
Christopher Tetreault	c858debebc	Remove asserting getters from base Type Summary: Remove asserting vector getters from Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: efriedma Subscribers: cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D77278	2020-04-17 14:03:31 -07:00
Daniel Sanders	701af684f6	[globalisel][legalizer] Expect to lose DebugLocs in dead code There's not really anything else that can be done with them. Fortunately, this dead code cleanup doesn't seem to trigger very often.	2020-04-17 13:45:44 -07:00
Daniel Sanders	5ef64bbf7a	[globalisel][legalizer] Include newly-dead code in artifact combine checks for DebugLoc loss This dead code deletion is part of the combine and the combine results should account for their locations.	2020-04-17 13:45:44 -07:00
Daniel Sanders	7f7f98b154	[globalisel][legalizer] Fix --verify-legalizer-debug-locs values It was using the enum class name, like so: =DebugLocVerifyLevel::None - No verification Changed it to: =none - No verification	2020-04-17 13:45:44 -07:00
Dominik Montada	55e3a7c6b2	[GlobalISel][AMDGPU] add legalization for G_FREEZE Summary: Copy the legalization rules from SelectionDAG: -widenScalar using anyext -narrowScalar using intermediate merges -scalarize/fewerElements using unmerge -moreElements using G_IMPLICIT_DEF and insert Add G_FREEZE legalization actions to AMDGPULegalizerInfo. Use the same legalization actions as G_IMPLICIT_DEF. Depends on D77795. Reviewers: dsanders, arsenm, aqjune, aditya_nandakumar, t.p.northover, lebedev.ri, paquette, aemerson Reviewed By: arsenm Subscribers: kzhuravl, yaxunl, dstuttard, tpr, t-tye, jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78092	2020-04-17 16:44:46 +02:00
jasonliu	77618cc237	[XCOFF][AIX] Fix getSymbol to return the correct qualname when necessary Summary: AIX symbols have a qualname and an unqualified name. The stock getSymbol could only return the unqualified name, which led us to patch many callers (lowerConstant, getMCSymbolForTOCPseudoMO). So we should try to address this problem on the callee side (getSymbol) and clean up the caller side instead. Note: this is a "mostly" NFC patch, with a fix for the original lowerConstant behavior. Differential Revision: https://reviews.llvm.org/D78045	2020-04-17 13:45:14 +00:00
Fraser Cormack	c819ef9653	Provide operand indices to adjustSchedDependency This allows targets to know exactly which operands are contributing to the dependency, which is required for targets with per-operand scheduling models. Differential Revision: https://reviews.llvm.org/D77135	2020-04-17 11:08:44 +01:00
Craig Topper	944cc5e0ab	[SelectionDAGBuilder][CGP][X86] Move some of SDB's gather/scatter uniform base handling to CGP. I've always found the "findValue" a little odd and inconsistent with other things in SDB. This simplifies the code in SDB to just handle a splat constant address or a 2 operand GEP in the same BB. This removes the need for "findValue" since the operands to the GEP are guaranteed to be available. The splat constant handling is new, but was needed to avoid regressions due to constant folding combining GEPs created in CGP. CGP is now responsible for canonicalizing gather/scatters into this form. The pattern I'm using for scalarizing, a scalar GEP followed by a GEP with an all zeroes index, seems to be subject to constant folding that the insertelement+shufflevector was not. Differential Revision: https://reviews.llvm.org/D76947	2020-04-16 17:49:22 -07:00
Wouter van Oortmerssen	48139ebc3a	[WebAssembly] Add int32 DW_OP_WASM_location variant This is to allow us to add relocatable global indices as a symbol. Also adds R_WASM_GLOBAL_INDEX_I32 relocation type to support it. See discussion in https://github.com/WebAssembly/debugging/issues/12	2020-04-16 16:32:17 -07:00
David Green	8e8c3c3408	[ARM] Mir test for machine sinking multiple def instructions. NFC	2020-04-16 20:58:14 +01:00
bd1976llvm	86478d3de9	[MC][ELF] Put explicit section name symbols into entry size compatible sections Ensure that symbols explicitly* assigned a section name are placed into a section with a compatible entry size. This is done by creating multiple sections with the same name** if incompatible symbols are explicitly given the name of an incompatible section, whilst: - Avoiding using uniqued sections where possible (for readability and to maximize compatibility with assemblers). - Creating as few SHF_MERGE sections as possible (for efficiency). Given that each symbol is assigned to a section in a single pass, we must decide which section each symbol is assigned to without seeing the properties of all symbols. A stable and easy to understand assignment is desirable. The following rules facilitate this: The "generic" section for a given section name will be mergeable if the name is a mergeable "default" section name (such as .debug_str), a mergeable "implicit" section name (such as .rodata.str2.2), or MC has already created a mergeable "generic" section for the given section name (e.g. in response to a section directive in inline assembly). Otherwise, the "generic" section for a given name is non-mergeable; and, non-mergeable symbols are assigned to the "generic" section, while mergeable symbols are assigned to uniqued sections. Terminology: "default" sections are those always created by MC initially, e.g. .text or .debug_str. "implicit" sections are those created normally by MC in response to the symbols that it encounters, i.e. in the absence of an explicit section name assignment on the symbol, e.g. a function foo might be placed into a .text.foo section. "generic" sections are those that are referred to when a unique section ID is not supplied, e.g. if there are multiple unique .bob sections then ".quad .bob" will reference the generic .bob section. Typically, the generic section is just the first section of a given name to be created.
Default sections are always generic. * Typically, section names might be explicitly assigned in source code using a language extension e.g. a section attribute: __attribute__ ((section ("section-name"))) - https://clang.llvm.org/docs/AttributeReference.html ** I refer to such sections as unique/uniqued sections. In assembly the ", unique," assembly syntax is used to express such sections. Fixes https://bugs.llvm.org/show_bug.cgi?id=43457. See https://reviews.llvm.org/D68101 for previous discussions leading to this patch. Some minor fixes were required to LLVM's tests, for tests had been using the old behavior - which allowed for explicitly assigning globals with incompatible entry sizes to a section. This fix relies on the ",unique ," assembly feature. This feature is not available until binutils version 2.35 (https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the integrated assembler is not being used then we avoid using this feature for compatibility and instead try to place mergeable symbols into non-mergeable sections or issue an error otherwise. Differential Revision: https://reviews.llvm.org/D72194	2020-04-16 19:12:49 +00:00
Amy Huang	2b8c6acc39	Reland "[codeview] Reference types in type parent scopes" Summary: Original description (https://reviews.llvm.org/D69924) Without this change, when a nested tag type of any kind (enum, class, struct, union) is used as a variable type, it is emitted without emitting the parent type. In CodeView, parent types point to their inner types, and inner types do not point back to their parents. We already walk over all of the parent scopes to build the fully qualified name. This change simply requests their type indices as we go along to ensure they are all emitted. Now, while walking over the parent scopes, add the types to DeferredCompleteTypes, since they might already be in the process of being emitted. Fixes PR43905 Reviewers: rnk, amccarth Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78249	2020-04-16 12:08:52 -07:00
Daniel Sanders	d9085f65db	[globalisel] Add lost debug locations verifier Summary: This verifier tries to ensure that DebugLoc's don't just disappear as we transform the MIR. It observes the instructions created, erased, and changed and at checkpoints chosen by the client algorithm verifies the locations affected by those changes. In particular, it verifies that: * Every DebugLoc for an erased/changing instruction is still present on at least one new/changed instruction * Failing that, that there is a line-0 location in the new/changed instructions. It's not possible to confirm which locations were merged so it conservatively assumes all unaccounted for locations are accounted for by any line-0 location to avoid false positives. If that fails, it prints the lost locations in the debug output along with the instructions that should have accounted for them. In theory, this is usable by the legalizer, combiner, selector and any other pass that performs incremental changes to the MIR. However, it has so far only really been tested on the legalizer (not including the artifact combiner) where it has caught lots of lost locations, particularly in Custom legalizations. There's only one example here as my initial testing was on an out-of-tree target and I haven't done a pass over the in-tree targets yet. Depends on D77575, D77446 Reviewers: bogner, aprantl, vsk Subscribers: jvesely, nhaehnle, mgorny, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77576	2020-04-16 10:43:35 -07:00
Daniel Sanders	7c6ca18fff	[globalisel] Allow backends to report an issue without triggering fallback. NFC Summary: This will allow us to fix the issue where the lost locations verifier causes CodeGen changes on lost locations because it falls back on DAGISel Reviewers: qcolombet, bogner, aprantl, vsk, paquette Subscribers: rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78261	2020-04-16 10:43:35 -07:00
David Green	44c4ba34d0	[MachineSink] Fix for breaking phi edges with instructions with multiple defs BreakPHIEdge would be set based on whether the instruction needs to insert a new critical edge to allow sinking into a block where the uses are PHI nodes. But for instructions with multiple defs it would be reset on the second def, allowing the instruction to sink where it should not. Fixes PR44981 Differential Revision: https://reviews.llvm.org/D78087	2020-04-16 16:42:07 +01:00
Konstantin Schwarz	1a3e89aa2b	[MIR] Add comments to INLINEASM immediate flag MachineOperands Summary: The INLINEASM MIR instructions use immediate operands to encode the values of some operands. The MachineInstr pretty printer function already handles those operands and prints human readable annotations instead of the immediates. This patch adds similar annotations to the output of the MIRPrinter, however uses the new MIROperandComment feature. Reviewers: SjoerdMeijer, arsenm, efriedma Reviewed By: arsenm Subscribers: qcolombet, sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78088	2020-04-16 13:46:14 +02:00
Carl Ritson	43e2460a89	[LiveIntervals] Replace handleMoveIntoBundle Summary: The current handleMoveIntoBundle implementation is unusable, it attempts to access the slot indexes of bundled instructions. It also leaves bundled instructions with slot indexes assigned. Replace handleMoveIntoBundle this with a more explicit handleMoveIntoNewBundle function which recalculates the live intervals for all instructions moved into a newly formed bundle, and removes slot indexes from these instructions. Reviewers: arsenm, MaskRay, kariddi, tpr, qcolombet Reviewed By: qcolombet Subscribers: MatzeB, wdng, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77969	2020-04-16 19:58:19 +09:00
Jeremy Morse	c8d6fa5134	[LiveDebugValues] Terminate open ranges on DBG_VALUE $noreg In D68209, LiveDebugValues::transferDebugValue had a call to OpenRanges.erase shifted, and by accident this led to a code path where DBG_VALUEs of $noreg would not have their open range terminated, allowing variable locations to extend past blocks where they were terminated. This patch correctly terminates the open range, if present, when such a DBG_VALUE is encountered, and adds a test for this behaviour. Differential Revision: https://reviews.llvm.org/D78218	2020-04-16 10:26:47 +01:00
Craig Topper	8e1408695c	[CallSite removal][TargetLibraryInfo] Replace ImmutableCallSite with CallBase in one of the getLibFunc signatures. NFC Differential Revision: https://reviews.llvm.org/D78083	2020-04-15 22:43:41 -07:00
Fangrui Song	7d1ff446b6	[MC] Rename MCSection*::getSectionName() to getName(). NFC A pending change will merge MCSection*::getName() to MCSection::getName().	2020-04-15 16:48:14 -07:00
Josh Stone	5a0d8c31a3	[NFC] correct "thier" to "their"	2020-04-15 14:38:52 -07:00
Eli Friedman	7c10541e56	[SelectionDAG] Fix usage of Align constructing MachineMemOperands. The "Align" passed into getMachineMemOperand etc. is the alignment of the MachinePointerInfo, not the alignment of the memory operation. (getAlign() on a MachineMemOperand automatically reduces the alignment to account for this.) We were passing on wrong (overconservative) alignment in a bunch of places. Fix a bunch of these, mostly in legalization. And while I'm here, switch to the new Align APIs. The test changes are all scheduling changes: the biggest effect of preserving large alignments is that it improves alias analysis, so the scheduler has more freedom. (I was originally just trying to do a minor cleanup in SelectionDAGBuilder, but I accidentally went deeper down the rabbit hole.) Differential Revision: https://reviews.llvm.org/D77687	2020-04-15 13:01:41 -07:00
Dominik Montada	443c244cff	[GlobalISel] translate freeze to new generic G_FREEZE Summary: As a follow up to https://reviews.llvm.org/D29014, add translation support for freeze. Introduce a new generic instruction G_FREEZE and translate freeze to it. Reviewers: dsanders, aqjune, arsenm, aditya_nandakumar, t.p.northover, lebedev.ri, paquette, aemerson Reviewed By: aqjune, arsenm Subscribers: fhahn, lebedev.ri, wdng, rovka, hiraditya, jfb, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77795	2020-04-15 16:47:05 +02:00
Benjamin Kramer	d790bd3999	Unbreak the build	2020-04-15 15:54:47 +02:00
Victor Campos	d85b3877dc	[CodeGen][ARM] Error when writing to specific reserved registers in inline asm Summary: No error or warning is emitted when specific reserved registers are written to in inline assembly. Therefore, writes to the program counter or to the frame pointer, for instance, were permitted, which could have led to undesirable behaviour. Example: int foo() { register int a __asm__("r7"); // r7 = frame-pointer in M-class ARM __asm__ __volatile__("mov %0, r1" : "=r"(a) : : ); return a; } In contrast, GCC issues an error in the same scenario. This patch detects writes to specific reserved registers in inline assembly for ARM and emits an error in such case. The detection works for output and input operands. Clobber operands are not handled here: they are already covered at a later point in AsmPrinter::emitInlineAsm(const MachineInstr *MI). The registers covered are: program counter, frame pointer and base pointer. This is ARM only. Therefore the implementation of other targets' counterparts remains to be done. Reviewers: efriedma Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76848	2020-04-15 14:40:42 +01:00
Denis Antrushin	edbb27ccb6	[Statepoint] Add getters to StatepointOpers. To simplify future work on statepoint representation, hide direct access to statepoint field indices and provide getters for them. Add getters for a couple more statepoint fields. This also fixes two bugs in MachineVerifier for statepoint: First, the `break` statement was falling out of `if` statement scope, thus disabling following checks. Second, it was incorrectly accessing some fields like CallingConv - StatepointOpers gives index to their value directly, not to preceding field type encoding. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D78119	2020-04-15 14:31:42 +03:00
Benjamin Kramer	6f64daca8f	Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints No functionality change intended.	2020-04-15 12:51:38 +02:00
QingShan Zhang	c9f9c79c5a	[NFC][DAGCombine] Change the value of NegatibleCost to make it align with the semantics This is a minor NFC change to make the code more clear. We have the NegatibleCost that has cheaper, neutral, and expensive. Typically, the smaller one means the less cost. It is inverse for current implementation, which makes following code not easy to read. If (CostX > CostY) negate(X) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D77993	2020-04-15 02:20:58 +00:00
Sam Clegg	3ea1c62cba	[WebAssembly] Emit .llvmcmd and .llvmbc as custom sections Fixes: https://bugs.llvm.org/show_bug.cgi?id=45362 Differential Revision: https://reviews.llvm.org/D77115	2020-04-14 13:24:18 -07:00
Thomas Raoux	c228c717aa	[AntidepBreaker] Move AntiDepBreaker to include folder. This allows AntiDepBreaker to be used in target specific postRA scheduler. Differential Revision: https://reviews.llvm.org/D78047	2020-04-14 11:40:57 -07:00
Georgii Rymar	1647ff6e27	[ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers. It can be used to avoid passing the begin and end of a range. This makes the code shorter and it is consistent with other wrappers we already have. Differential revision: https://reviews.llvm.org/D78016	2020-04-14 14:11:02 +03:00
Craig Topper	3043093822	[CallSite removal][CodeGen] Replace ImmutableCallSite with CallBase in isInTailCallPosition.	2020-04-13 23:04:57 -07:00
Mircea Trofin	4aae4e3f48	[llvm][NFC] CallSite removal from inliner-related files Summary: This removes CallSite from inliner files. Some dependencies were thus affected. Reviewers: dblaikie, davidxl, craig.topper Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77991	2020-04-13 21:28:58 -07:00
Craig Topper	113f37a1f9	[CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase Differential Revision: https://reviews.llvm.org/D77995	2020-04-13 13:50:15 -07:00
Rahman Lavaee	05192e585c	Extend BasicBlock sections to allow specifying clusters of basic blocks in the same section. Differential Revision: https://reviews.llvm.org/D76954	2020-04-13 12:19:59 -07:00
Rahman Lavaee	4ddf7ab454	Revert "Extend BasicBlock sections to allow specifying clusters of basic blocks" This reverts commit `0d4ec16d3d` Because tests were not added to the commit.	2020-04-13 12:19:59 -07:00
Rahman Lavaee	0d4ec16d3d	Extend BasicBlock sections to allow specifying clusters of basic blocks in the same section. This allows specifying BasicBlock clusters like the following example: !foo !!0 1 2 !!4 This places basic blocks 0, 1, and 2 in one section in this order, and places basic block #4 in a single section of its own.	2020-04-13 11:46:11 -07:00
Vedant Kumar	122a6bfb07	[Debugify] Strip added metadata in the -debugify-each pipeline Summary: Share logic to strip debugify metadata between the IR and MIR level debugify passes. This makes it simpler to hunt for bugs by diffing IR with vs. without -debugify-each turned on. As a drive-by, fix an issue causing CallGraphNodes to become invalid when a dead llvm.dbg.value prototype is deleted. Reviewers: dsanders, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77915	2020-04-13 10:55:17 -07:00
Craig Topper	68eb08646c	[CallSite removal][GlobalISel] Use CallBase instead of CallSite in lowerCall and translateCallBase. Differential Revision: https://reviews.llvm.org/D78001	2020-04-13 10:31:30 -07:00
Matt Arsenault	e6605a209c	DAG: Fix wrong legality check for ISD::FMAD Since `1725f28841`, this should check isFMADLegalForFAddFSub rather than the plain isOperationLegal. This would assert in a subset of cases due to an oddity in how FMAD is selected. We will allow FMA formation pre-legalize, but not FMAD even in cases where it would be valid. The current hook requires passing in the root fadd/fsub. However, in this distributed case, this would be far more complicated to pass in the relevant operand. AMDGPU doesn't get any value from the node, and only needs the type and is the only implementor, so I'm not sure why we have this complexity. Just rename and expand the assert to avoid the more complicated checks spread through the distribution logic.	2020-04-13 10:25:39 -07:00
Craig Topper	f06cf9da89	[CallSite removal][CodeGen] Use CallBase instead of CallSite in getNoopInput in Analysis.cpp. NFC	2020-04-13 00:20:12 -07:00
Craig Topper	5889c5a814	[CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in TargetFrameLoweringInfo. NFC	2020-04-13 00:20:12 -07:00
Craig Topper	e59162960c	[CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in IntrinsicLowering. NFC	2020-04-13 00:19:27 -07:00
Craig Topper	83208cdd57	[CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in WinEHPrepare. NFC	2020-04-13 00:19:27 -07:00
Craig Topper	42487eafa6	[CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in SwiftErrorValueTracking. NFC	2020-04-13 00:19:27 -07:00
Craig Topper	dbb272b0a3	[CallSite removal][FastISel] Use CallBase instead of CallSite in fastLowerCall.	2020-04-12 18:02:24 -07:00
Craig Topper	95192f548d	[CallSite removal][TargetLowering] Use CallBase instead of CallSite in TargetLowering::ParseConstraints interface. Differential Revision: https://reviews.llvm.org/D77929	2020-04-12 11:26:25 -07:00
Jonathan Roelofs	41f13f1f64	reland: [DAG] Fix PR45049: LegalizeTypes crash Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG does, leading to accidental SDValue removal before its reference count was truly zero. Differential Revision: https://reviews.llvm.org/D76994 Reviewed-By: bjope Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049 Reverted in `3ce77142a6` because the previous patch broke the expensive-checks bots. The new patch removes the broken check.	2020-04-12 09:52:17 -06:00
Craig Topper	5b42399029	[CallSite removal][FastISel] Remove uses of CallSite. Differential Revision: https://reviews.llvm.org/D77933	2020-04-11 20:52:45 -07:00
Craig Topper	806763efcf	[CallSite removal][SelectionDAGBuilder] Use CallBase instead of ImmutableCallSite in visitPatchpoint. Differential Revision: https://reviews.llvm.org/D77932	2020-04-11 13:07:31 -07:00
Matt Arsenault	1747ba25b2	GlobalISel: Fix typo in assert message	2020-04-11 16:02:26 -04:00
Hongtao Yu	11455a7905	[CodeGen] Allow partial tail duplication in Machine Block Placement. Summary: A count profile may affect tail duplication's heuristic causing a block to be duplicated in only a part of its predecessors. This is not allowed in the Machine Block Placement pass where an assert will go off. I'm removing the assert and making the optimization bail out when such case happens. Reviewers: wenlei, davidxl, Carrot Reviewed By: Carrot Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77748	2020-04-11 12:20:31 -07:00
Sanjay Patel	1318ddbc14	[VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC As proposed in D77881, we'll have the related widening operation, so this name becomes too vague. While here, change the function signature to take an 'int' rather than 'size_t' for the scaling factor, add an assert for overflow of 32-bits, and improve the documentation comments.	2020-04-11 10:05:49 -04:00
Simon Pilgrim	89f6ca05b7	CodeGen/EdgeBundles - move Twine.h include down into EdgeBundles.cpp. NFC. EdgeBundles.h has no use for it.	2020-04-11 12:21:04 +01:00
Craig Topper	9c1842d8af	Change FastISel::CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This is the same as what was done to the CallLoweringInfo in TargetLowering.h in r309159. This is just a step on the way to replacing this with CallBase.	2020-04-10 23:45:36 -07:00
Craig Topper	f49f6cf91e	[CallSite removal][SelectionDAGBuilder] Remove most CallSite usage from visitInlineAsm. I only left it at the interface to ParseConstraints since that needs updates to other callers in different files. I'll do that as a follow up. Differential Revision: https://reviews.llvm.org/D77892	2020-04-10 19:23:33 -07:00
Matt Arsenault	49ae0fc2f0	GlobalISel: Fix incorrect lowering G_FCOPYSIGN In the basic case, this was reading the sign from the wrong operand.	2020-04-10 21:00:25 -04:00
Daniel Sanders	f71350f05a	Add -debugify-and-strip-all to add debug info before a pass and remove it after Summary: This allows us to test each backend pass under the presence of debug info using pre-existing tests. The tests should not fail as a result of this so long as it's true that debug info does not affect CodeGen. In practice, a few tests are sensitive to this: * Tests that check the pass structure (e.g. O0-pipeline.ll) * Tests that check --debug output. Specifically instruction dumps containing MMO's (e.g. prelegalizercombiner-extends.ll) * Tests that contain debugify metadata as mir-strip-debug will remove it (e.g. fastisel-debugvalue-undef.ll) * Tests with partial debug info (e.g. patchable-function-entry-empty.mir had debug info but no !llvm.dbg.cu) * Tests that check optimization remarks overly strictly (e.g. prologue-epilogue-remarks.mir) * Tests that would inject the pass in an unsafe region (e.g. seqpairspill.mir would inject between register alloc and virt reg rewriter) In all cases, the checks can either be updated or --debugify-and-strip-all-safe=0 can be used to avoid being affected by something like llvm-lit -Dllc='llc --debugify-and-strip-all-safe' I tested this without the lost debug locations verifier to confirm that AArch64 behaviour is unaffected (with the fixes in this patch) and with it to confirm it finds the problems without the additional RUN lines we had before. Depends on D77886, D77887, D77747 Reviewers: aprantl, vsk, bogner Subscribers: qcolombet, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77888	2020-04-10 16:36:07 -07:00
Daniel Sanders	dfca98d6a8	[mir-strip-debug] Optionally preserve debug info that wasn't from debugify/mir-debugify Summary: A few tests start out with debug info and expect it to reach the output. For these tests we shouldn't strip the debug info Reviewers: aprantl, vsk, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77886	2020-04-10 15:24:14 -07:00
Christopher Tetreault	889f6606ed	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: stoklund, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77272	2020-04-10 14:53:43 -07:00
Daniel Sanders	c162bc2aed	Make TargetPassConfig and llc add pre/post passes the same way. NFC Summary: At the moment, any changes we make to the passes that can be injected before/after others (e.g. -verify-machineinstrs and -print-after-all) have to be duplicated in both TargetPassConfig (for normal execution, -start-before/ -stop-before/etc) and llc (for -run-pass). Unify this pass injection into addMachinePrePass/addMachinePostPass that both TargetPassConfig and llc can use. Reviewers: vsk, aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77887	2020-04-10 13:46:53 -07:00
Marcello Maggioni	ea11f4726f	Split LiveRangeCalc in LiveRangeCalc/LiveIntervalCalc. NFC Summary: Refactor LiveRangeCalc such that it is now split into two classes. The objective is to split all the "register specific" logic away from LiveRangeCalc. The two new classes created are: - LiveRangeCalc - is meant as a generic class to compute and modify live ranges in a generic way. This class should deal only with SlotIndices and VNInfo objects. - LiveIntervalCalc - is meant to be equivalent to the old LiveRangeCalc. It computes the liveness of virtual registers tracked by a LiveInterval object. With this refactoring LiveRangeCalc can be used to implement tracking of liveness of LiveRanges that represent other things than just registers. Subscribers: MatzeB, qcolombet, mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76584	2020-04-10 11:26:21 -07:00
Sumanth Gundapaneni	a04ab2ec08	[Pipeliner] Fix the bug in pragma that disables the pipeliner. Differential Revision: https://reviews.llvm.org/D76303.	2020-04-10 12:52:16 -05:00
Simon Pilgrim	a88cc20456	ProfileSummaryInfo.h - remove unnecessary includes. NFC Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration. This exposed a couple of source files that were implicitly relying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.	2020-04-10 16:25:48 +01:00
Serguei Katkov	4275eb1331	Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This requires runtime support to know how to take a value from register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77797	2020-04-10 10:13:39 +07:00
Francesco Petrogalli	c846d2682b	[llvm][Codegen] Make `getVectorTypeBreakdownMVT` work with scalable types. Reviewers: efriedma, andwar, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77434	2020-04-10 00:48:27 +01:00
Daniel Sanders	a79b2fc44b	Add pass to strip debug info from MIR Summary: Removes: * All LLVM-IR level debug info using StripDebugInfo() * All debugify metadata * 'Debug Info Version' module flag * All (valid) DEBUG_VALUE MachineInstrs * All DebugLocs from MachineInstrs This is a more complete solution than the previous MIRPrinter option that just causes it to neglect to print debug-locations. * The qualifier 'valid' is used here because AArch64 emits an invalid one and tests depend on it Reviewers: vsk, aprantl, bogner Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77747	2020-04-09 15:44:38 -07:00
Serguei Katkov	44f0d7f136	Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values." This reverts commit `a0275705bb`. It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.	2020-04-09 18:24:47 +07:00
Serguei Katkov	a0275705bb	[Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This requires runtime support to know how to take a value from register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77371	2020-04-09 16:57:35 +07:00
Jay Foad	bf730e1686	[CodeGen] Fix a simple FIXME. NFC.	2020-04-09 10:54:03 +01:00
Jay Foad	c63aed890e	[KnownBits] Move AND, OR and XOR logic into KnownBits Summary: There are at least three clients for KnownBits calculations: ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the common logic should be moved out of these clients and into KnownBits itself. This patch does this for AND, OR and XOR calculations by implementing and using appropriate operator overloads KnownBits::operator& etc. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74060	2020-04-09 10:10:37 +01:00
Matt Arsenault	0aa0d70067	MIR: Use Register	2020-04-08 22:07:26 -04:00
Amara Emerson	befc788cfa	GlobalISel: Add a setInstrAndDebugLoc(MachineInstr&) convenience helper to MachineIRBuilder. NFC. This saves doing two separate calls to set the Instr and DebugLoc from an existing MI.	2020-04-08 14:38:33 -07:00
Matt Arsenault	e49e33b610	CodeGen: Use Register in MachineInstrBuilder	2020-04-08 17:03:53 -04:00
Matt Arsenault	c42cc7fd24	CodeGen: Use Register in MachineSSAUpdater	2020-04-08 14:29:01 -04:00
Vedant Kumar	48e65fc630	MachineFunction: Copy call site info when duplicating insts Summary: Preserve call site info for duplicated instructions. We copy over the call site info in CloneMachineInstrBundle to avoid repeated calls to copyCallSiteInfo in CloneMachineInstr. (Alternatively, we could copy call site info higher up the stack, e.g. into TargetInstrInfo::duplicate, or even into individual backend passes. However, I don't see how that would be safer or more general than the current approach.) Reviewers: aprantl, djtodoro, dstenb Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77685	2020-04-08 11:06:14 -07:00
Matt Arsenault	586769cce2	DAG: Use Register	2020-04-08 13:44:31 -04:00
Matt Arsenault	dcce3ef1d2	FastISel: Partially use Register Doesn't try to convert the cases that depend on generated code.	2020-04-08 12:10:58 -04:00
Matt Arsenault	7a46e36d51	CodeGen: Use Register more in CallLowering Some of these MCPhysReg uses should probably be MCRegister, but right now this would require more invasive changes.	2020-04-08 12:10:58 -04:00
Matt Arsenault	ca0ace7298	CodeGen: Use Register in MachineBasicBlock	2020-04-08 12:10:58 -04:00
Jeremy Morse	c77887e4d1	[DebugInfo][NFC] Early-exit when analyzing for single-location variables This is a performance patch that hoists two conditions in DwarfDebug's validThroughout to avoid a linear-scan of all instructions in a block. We now exit early if validThroughout will never return true for the variable location. The first added clause filters for the two circumstances where validThroughout will return true. The second added clause should be identical to the one that's deleted from after the linear-scan. Differential Revision: https://reviews.llvm.org/D77639	2020-04-08 12:27:11 +01:00
Mikael Holmen	893df2032d	[IfConversion] Disallow TrueBB == FalseBB for valid diamonds Summary: This fixes PR45302. Previously the case BB1 / \ \| \| TBB FBB \| \| \ / BB2 was treated as a valid diamond also when TBB and FBB were the same basic block. This then led to a failed assertion in IfConvertDiamond. Since TBB == FBB is quite a degenerated case of a diamond, we now don't treat it as a valid diamond anymore, and thus we will avoid the trouble of making IfConvertDiamond handle it correctly. Reviewers: efriedma, kparzysz Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77651	2020-04-08 12:50:36 +02:00
Dominik Montada	35950fea8d	[GlobalISel] support narrow G_IMPLICIT_DEF for DstSize % NarrowSize != 0 Summary: When narrowing G_IMPLICIT_DEF where the original size is not a multiple of the narrow size, emit a smaller G_IMPLICIT_DEF and use G_ANYEXT. To prevent a potential endless loop in the legalizer, the condition to combine G_ANYEXT(G_IMPLICIT_DEF) is changed from isInstUnsupported to !isInstLegal, since in this case the combine is only valid if consequent legalization of the newly combined G_IMPLICIT_DEF does not introduce G_ANYEXT due to narrowing. Although this legalization for G_IMPLICIT_DEF would also be valid for the general case, it actually caused a lot of code regressions when tried due to superfluous COPYs and combines not getting hit anymore. Reviewers: dsanders, aemerson, volkan, arsenm, aditya_nandakumar Reviewed By: arsenm Subscribers: jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76598	2020-04-08 11:00:07 +02:00
Daniel Sanders	1adeeabb79	Add MIR-level debugify with only locations support for now Summary: Re-used the IR-level debugify for the most part. The MIR-level code then adds locations to the MachineInstrs afterwards based on the LLVM-IR debug info. It's worth mentioning that the resulting locations make little sense as the range of line numbers used in a Function at the MIR level exceeds that of the equivalent IR level function. As such, MachineInstrs can appear to originate from outside the subprogram scope (and from other subprogram scopes). However, it doesn't seem worth worrying about as the source is imaginary anyway. There's a few high level goals this pass works towards: * We should be able to debugify our .ll/.mir in the lit tests without changing the checks and still pass them. I.e. Debug info should not change codegen. Combining this with a strip-debug pass should enable this. The main issue I ran into without the strip-debug pass was instructions with MMO's and checks on both the instruction and the MMO as the debug-location is between them. I currently have a simple hack in the MIRPrinter to resolve that but the more general solution is a proper strip-debug pass. * We should be able to test that GlobalISel does not lose debug info. I recently found that the legalizer can be unexpectedly lossy in seemingly simple cases (e.g. expanding one instr into many). I have a verifier (will be posted separately) that can be integrated with passes that use the observer interface and will catch location loss (it does not verify correctness, just that there's zero lossage). It is a little conservative as the line-0 locations that arise from conflicts do not track the conflicting locations but it can still catch a fair bit. Depends on D77439, D77438 Reviewers: aprantl, bogner, vsk Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77446	2020-04-07 16:25:13 -07:00
Matt Arsenault	6011627f51	CodeGen: More conversions to use Register	2020-04-07 18:54:36 -04:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00
Matt Arsenault	aa26dd9858	CodeGen: Use Register in more places	2020-04-07 15:59:40 -04:00
Craig Topper	c41685b16f	[SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector. This removes a call to getScalarType from a bunch of call sites. It also makes the behavior consistent with SIGN_EXTEND_INREG. Differential Revision: https://reviews.llvm.org/D77631	2020-04-07 11:34:08 -07:00
Matt Arsenault	b281138a1b	DAG: Use the correct getPointerTy in a few places These should not be assuming address space 0. Calling getPointerTy is generally the wrong thing to do, since you should already know the type from the incoming IR.	2020-04-07 12:45:41 -04:00
Nikita Popov	259649a519	[RDA] Avoid full reprocessing of blocks in loops (NFCI) RDA sometimes needs to visit blocks twice, to take into account reaching defs coming in along loop back edges. Currently it handles repeated visitation the same way as usual, which means that it will scan through all instructions and their reg unit defs again. Not only is this very inefficient, it also means that all reaching defs in loops are going to be inserted twice. We can do much better than this. The only thing we need to handle is a new reaching def from a predecessor, which either needs to be prepended to the reaching definitions (if there was no reaching def from a predecessor), or needs to replace an existing predecessor reaching def, if it is more recent. Since D77508 we only store the most recent predecessor reaching def, so that's the only one that may need updating. This also has the nice side-effect that reaching definitions are now automatically sorted and unique, so drop the llvm::sort() call in favor of an assertion. Differential Revision: https://reviews.llvm.org/D77511	2020-04-07 17:55:37 +02:00
Nikita Popov	76e987b372	[RDA] Don't pass down TraversedMBB (NFC) Only pass the MachineBasicBlock itself down to helper methods, they don't need to know about traversal. Move the debug print into the main method.	2020-04-07 17:53:04 +02:00
Nikita Popov	361c29d7ba	[RDA] Avoid inserting duplicate reaching defs (NFCI) An instruction may define the same reg unit multiple times, avoid inserting the same reaching def multiple times in that case. Also print the reg unit, rather than the super-register, in the debug code.	2020-04-07 17:50:38 +02:00
Serguei Katkov	b7e3759e17	[DAG] Consolidate require spill slot logic in lambda. NFC. Move the logic whether lowering of deopt value requires a spill slot in a separate lambda. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77629	2020-04-07 16:43:47 +07:00
Pierre-vh	4fc59a468f	Revert "[CodeGen][SelectionDAG] Flip Booleans More Often" This reverts commit `23342bdcc8`.	2020-04-07 09:09:10 +01:00
Pierre-vh	23342bdcc8	[CodeGen][SelectionDAG] Flip Booleans More Often Differential Revision: https://reviews.llvm.org/D77201	2020-04-07 08:19:57 +01:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type hierarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Davide Italiano	8115e08b05	[MachineCSE] Don't carry the wrong location when hoisting PR: 45425 <rdar://problem/61359768> Differential Revision: https://reviews.llvm.org/D77604	2020-04-06 16:36:22 -07:00
Daniel Sanders	f27cea721e	Add way to omit debug-location from MIR output Summary: In lieu of a proper pass that strips debug info, add a way to omit debug-locations from the MIR output so that instructions with MMO's continue to match CHECK's when mir-debugify is used Reviewers: aprantl, bogner, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77575	2020-04-06 16:22:01 -07:00
Daniel Sanders	35b7b0851b	Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify) Summary: To debugify MIR, we need to be able to create metadata and to do that, we need a non-const Module. However, MachineFunction only had a const reference to the Function preventing this. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77439	2020-04-06 15:19:21 -07:00
Leonard Chan	a0222ac1f9	[AsmPrinter] Do not define local aliases for global objects in a comdat A global symbol that is defined in a comdat should not generate an alias since call sites that would've referred to that symbol will refer to their own independent local aliases rather than the surviving global comdat one. This could result in something that looks like: ``` ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o) >>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o) >>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169) >>> minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a ``` We ran into this when experimenting with a new C++ ABI for fuchsia (refer to D72959) which takes relative offsets between comdat'd functions which is why the normal C++ user wouldn't run into this. Differential Revision: https://reviews.llvm.org/D77429	2020-04-06 13:48:05 -07:00
Nick Desaulniers	5bc291be71	[SelectionDAG] fix predecessor list for INLINEASM_BRs' parent Summary: A bug report mentioned that LLVM was producing jumps off the end of a function when using "asm goto with outputs". Further digging pointed to MachineBasicBlocks that had their address taken and were indirect targets of INLINEASM_BR being removed by BranchFolder, because their predecessor list was empty, so they appeared to have no entry. This was a cascading failure caused earlier, during Pre-RA instruction scheduling. We have a few special cases in Pre-RA instruction scheduling where we split a MachineBasicBlock in two. This requires careful handling of predecessor and successor lists for a MachineBasicBlock that was split, and careful handling of PHI MachineInstrs that referred to the MachineBasicBlock before it was split. The clue that led to this fix was the observation that many callers of MachineBasicBlock::splice() frequently call MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI nodes after a splice. We don't want to reuse that method, as we have custom successor transferring logic for this block split. This patch fixes 2 pre-existing bugs, and adds tests. The first bug was that MachineBasicBlock::splice() correctly handles updating most successors and predecessors; we don't need to do anything more than removing the previous fallthrough block from the first half of the split block post splice. Previously, we were updating the successor list incorrectly (updating successors updates predecessors). The second bug was that PHI nodes that needed registers from the first half of the split block were not having entries populated. The register live out information was correct, and the FuncInfo->PHINodesToUpdate was correct.
Specifically, the check in SelectionDAGISel::FinishBasicBlock: for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) { MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first); if (!FuncInfo->MBB->isSuccessor(PHI->getParent())) continue; PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB); was `continue`ing because FuncInfo->MBB tracks the second half of the post-split block; no one was updating PHI entries for the first half of the post-split block. SelectionDAGBuilder::UpdateSplitBlock() already expects to perform special handling for MachineBasicBlocks that were split post calls to ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the current MachineBasicBlock being processed), and perform special handling for this in SelectionDAGBuilder::UpdateSplitBlock(). Reviewers: void, craig.topper, efriedma Reviewed By: void, efriedma Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D76961	2020-04-06 13:46:39 -07:00
Francesco Petrogalli	53b7abdd23	[llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`. Reviewers: sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77317	2020-04-06 19:46:11 +01:00
Craig Topper	07ed1fb597	[SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types. The previous code used the type of the first field for the VT passed to getNode for every field. I've based the implementation here off what is done in visitSelect as it removes the need to special case aggregates. Differential Revision: https://reviews.llvm.org/D77093	2020-04-06 11:03:40 -07:00
Nikita Popov	e8b83f7ddc	[RDA] Only store most recent reaching def from predecessors (NFCI) When entering a basic block, RDA inserts reaching definitions coming from predecessor blocks (which will be negative numbers) in a rather peculiar way. If you have incoming reaching definitions -4, -3, -2, -1, it will insert those. If you have incoming reaching definitions -1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken at each step. That's probably not what was intended... However, RDA only actually cares about the most recent reaching definition from a predecessor (to calculate clearance), so this ends up working fine as far as behavior is concerned. It does waste memory on unnecessary reaching definitions though. This patch changes the implementation to first compute the most recent reaching definition in one loop, and then insert only that one in a separate loop. Differential Revision: https://reviews.llvm.org/D77508	2020-04-06 18:39:09 +02:00
Nikita Popov	8d75df1438	[RDA] Don't adjust ReachingDefDefaultVal (NFCI) At the end of a basic block, RDA adjusts all the reaching defs it found to be relative to the end of the basic block, rather than the start of it. However, it also does this to registers which don't have a reaching def, indicated by ReachingDefDefaultVal. This means that code checking against ReachingDefDefaultVal will not skip them, and may insert them into the reaching definition list. This is ultimately harmless, but causes unnecessary work and is logically not right. Differential Revision: https://reviews.llvm.org/D77506	2020-04-06 18:36:29 +02:00
Matt Arsenault	70726cec5b	DAG: Combine extract_vector_elt of concat_vectors Fixes extra canonicalize regressions when legalizing vector fminnum/fmaxnum.	2020-04-06 09:26:29 -04:00
Sourabh Singh Tomar	5d7e9adce2	[DWARF5] Added support for emission of debug_macro section. Summary: This patch adds support for emission of following DWARFv5 macro forms in .debug_macro section. 1. DW_MACRO_start_file 2. DW_MACRO_end_file 3. DW_MACRO_define_strp 4. DW_MACRO_undef_strp. Reviewed By: dblaikie, ikudrin Differential Revision: https://reviews.llvm.org/D72828	2020-04-06 17:45:10 +05:30
Guillaume Chatelet	ff858d7781	[Alignment][NFC] Add DebugStr and operator* Summary: This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately) Differences from D77394: - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)` - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll) - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum) Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77537	2020-04-06 12:09:45 +00:00
Oliver Stannard	a294d9eb21	Revert "[IPRA][ARM] Spill extra registers at -Oz" Reverting because this is causing failures on bots with expensive checks enabled. This reverts commit `73cea83a6f`.	2020-04-06 10:34:59 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Craig Topper	97e57f3b24	[DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine. We're ANDing with 1 right after, which will cause the SIGN_EXTEND to be combined to ANY_EXTEND later. Might as well just start with an ANY_EXTEND. While there, create the AND using the getZeroExtendInReg helper to remove the need to explicitly create the VecOnes constant.	2020-04-05 22:44:45 -07:00
Craig Topper	586c051a27	[DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition it's trying to protect. This code is replacing a shift with a new shift on an extended type. If the shift amount type can't represent the maximum shift amount for the new type, the amount needs to be extended to a type that can. Previously, the code just hardcoded a check for 256 bits which seems to have been an assumption that the original shift amount was MVT::i8. But that seems more catered to a specific target like X86 that uses i8 as its legal shift amount type. Other targets may use different types. This commit changes the code to look at the real type of the shift amount and makes sure it has enough bits for the Log2 of the new type. There are similar checks to this in SelectionDAGBuilder and LegalizeIntegerTypes.	2020-04-05 20:35:57 -07:00
Sourabh Singh Tomar	0d71782f4e	[DebugInfo]: Allow DwarfCompileUnit to have line table symbol Previously the line table symbol was represented as `DIE::value_iterator` inside `DwarfCompileUnit`, and the subsequent function `initStmtList` was used to create a local `MCSymbol` to initialize it. This patch removes `DIE::value_iterator` from `DwarfCompileUnit` and introduces an `MCSymbol` for representing this unit's symbol for the `debug_line` section. As a result, `applyStmtList` is also modified to utilize this. Furthermore, a helper function `getLineTableStartSym` is introduced to get this symbol; this would be used by clients which need to access this line table, i.e. `debug_macro`. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D77489	2020-04-06 00:14:29 +05:30
Zuojian Lin	a58c8a7866	Remove the additional constant which requires an extra register for statepoint lowering. The newly-created constant zero will need an extra register to hold it in the current statepoint lowering implementation. Remove it if there exists one.	2020-04-05 11:22:09 -04:00
Jonathan Roelofs	3ce77142a6	Revert "[DAG] Fix PR45049: LegalizeTypes crash" This reverts commit `17673ae0b2`.	2020-04-04 13:47:22 -06:00
Jonathan Roelofs	17673ae0b2	[DAG] Fix PR45049: LegalizeTypes crash Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG does, leading to accidental SDValue removal before its reference count was truly zero. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049 https://reviews.llvm.org/D76994	2020-04-04 13:36:22 -06:00
Heejin Ahn	fc5d8b672b	[WebAssembly] Fix a sanitizer error in WasmEHPrepare Summary: D77423 started using a dominator tree in WasmEHPrepare, but we deleted BBs in `prepareThrows` before we used the domtree in `prepareEHPads`, and those CFG changes were not reflected in the domtree. This uses `DomTreeUpdater` to make sure we update the domtree every time we delete BBs from the CFG. This fixes ubsan/msan/expensive_check errors caught in LLVM buildbots. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77465	2020-04-04 09:57:07 -07:00
Heejin Ahn	2e9839729d	[WebAssembly] Fix wasm.lsda() optimization in WasmEHPrepare Summary: When we insert a call to the personality function wrapper (`_Unwind_CallPersonality`) for a catch pad, we store some necessary info in `__wasm_lpad_context` struct and pass it. One of the info is the LSDA address for the function. For this, we insert a call to `wasm.lsda()`, which will be lowered down to the address of LSDA, and store it in a field in `__wasm_lpad_context`. There are exceptions to this personality call insertion: catchpads for `catch (...)` and cleanuppads (for destructors) don't need personality function calls, because we don't need to figure out whether the current exception should be caught or not. (They always should.) There was a little optimization to `wasm.lsda()` call insertion. Because the LSDA address is the same throughout a function, we don't need to insert a store of `wasm.lsda()` return value in every catchpad. For example: ``` try { foo(); } catch (int) { // wasm.lsda() call and a store are inserted here, like, in // pseudocode, // %lsda = wasm.lsda(); // store %lsda to a field in __wasm_lpad_context try { foo(); } catch (int) { // We don't need to insert the wasm.lsda() and store again, because // to arrive here, we have already stored the LSDA address to // __wasm_lpad_context in the outer catch. } } ``` So the previous algorithm checked if the current catch has a parent EH pad, we didn't insert a call to `wasm.lsda()` and its store. But this was incorrect, because what if the outer catch is `catch (...)` or a cleanuppad? ``` try { foo(); } catch (...) { // wasm.lsda() call and a store are NOT inserted here try { foo(); } catch (int) { // We need wasm.lsda() here! } } ``` In this case we need to insert `wasm.lsda()` in the inner catchpad, because the outer catchpad does not have one. 
To minimize the number of inserted `wasm.lsda()` calls and stores, we need a way to figure out whether we have encountered `wasm.lsda()` call in any of EH pads that dominates the current EH pad. To figure that out, we now visit EH pads in BFS order in the dominator tree so that we visit parent BBs first before visiting its child BBs in the domtree. We keep a set named `ExecutedLSDA`, which basically means "Do we have `wasm.lsda()` either in the current EH pad or any of its parent EH pads in the dominator tree?". This is to prevent scanning the domtree up to the root in the worst case every time we examine an EH pad: each EH pad only needs to examine its immediate parent EH pad. - If any of its parent EH pads in the domtree has `wasm.lsda()`, this means we don't need `wasm.lsda()` in the current EH pad. We also insert the current EH pad in `ExecutedLSDA` set. - If none of its parent EH pad has `wasm.lsda()` - If the current EH pad is a `catch (...)` or a cleanuppad, done. - If the current EH pad is neither a `catch (...)` nor a cleanuppad, add `wasm.lsda()` and the store in the current EH pad, and add the current EH pad to `ExecutedLSDA` set. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77423	2020-04-04 07:02:50 -07:00
Matt Arsenault	30ebafaa56	CodeGen: Convert some TII hooks to use Register	2020-04-03 14:52:54 -04:00
jasonliu	d65557d15d	[NFC][XCOFF][AIX] Refactor get/setContainingCsect Summary: With the current architecture, we always require setContainingCsect to be called on every MCSymbol that is used in an XCOFF context. This is very hard to achieve because symbols get created everywhere and other MCSymbol types (ELF, COFF) do not have similar rules. It's very easy to miss setting the containing csect, and we would need to add a lot of XCOFF-specialized code around some common code areas. This patch intends to 1. Rely on getFragment().getParent() to get the csect from labels. 2. Only use get/setRepresentedCsect (was get/setContainingCsect) if the symbol itself represents a csect. Reviewers: DiggerLin, hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D77080	2020-04-03 13:33:12 +00:00
Guillaume Chatelet	9068bccbae	[Alignment][NFC] Deprecate InstrTypes getRetAlignment/getParamAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77312	2020-04-03 13:21:58 +00:00
Guillaume Chatelet	ca11c480e7	[Alignment][NFC] Convert MachineIRBuilder::buildDynStackAlloc to Align Summary: The change in IRTranslator is not trivial but is NFC as far as I can tell. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77292	2020-04-03 09:05:19 +00:00
Guillaume Chatelet	9f5c786876	[NFC] G_DYN_STACKALLOC realign iff align > 1, update documentation Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77372	2020-04-03 08:12:39 +00:00
Serguei Katkov	bd1d70bf0e	[DAG] Change isGCValue detection for statepoint lowering isGCValue should detect whether the deopt value is a GC pointer. Currently it checks by finding the value in SI.Bases and SI.Ptrs. However these data structures contain only those values which have corresponding gc.relocate call. So we can miss GC value if it does not have gc.relocate call (dead after the call). Check GC strategy whether pointer is GC one or consider any pointer to be GC one conservatively. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77130	2020-04-03 12:36:13 +07:00
Simon Pilgrim	b02c7a8152	Fix "result of 32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI. The shift of 1 by an amount that is never more than 31 means that the warning is a false positive but is safe and fixes Werror builds.	2020-04-02 12:02:04 +01:00
Guillaume Chatelet	96cae168fa	[NFC] Preparatory work for D77292	2020-04-02 09:30:33 +00:00
Guillaume Chatelet	189d2e215f	[Alignment][NFC] Use more Align versions of various functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, arsenm, sdardis, jvesely, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77291	2020-04-02 09:00:53 +00:00
OCHyams	550ab58bc1	[NFC] Fix performance issue in LiveDebugVariables When compiling AMDGPUDisassembler.cpp in a stage 1 trunk build with CMAKE_BUILD_TYPE=RelWithDebInfo LLVM_USE_SANITIZER=Address LiveDebugVariables accounts for 21.5% wall clock time. This fix reduces that to 1.2% by switching out a linked list lookup with a map lookup. Note that the linked list is still used to group UserValues by vreg. The vreg lookups don't cause any problems in this pathological case. This is the same idea as D68816, which was reverted, except that it is a less intrusive fix. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D77226	2020-04-02 09:39:33 +01:00
Daniel Sanders	e65e677ee4	[globalisel][legalizer] Fix DebugLoc bugs caught by a prototype lost-location verifier The legalizer has a tendency to lose DebugLoc's when expanding or combining instructions. The verifier that detected these isn't ready for upstreaming yet but this patch fixes the cases that came up when applying it to our out-of-tree backend's CodeGen tests. This pattern comes up a few more times in this file and probably in the backends too but I'd prefer to fix the others separately (and preferably when the lost-location verifier detects them).	2020-04-01 12:50:18 -07:00
Jessica Clarke	616289ed29	[LegalizeTypes][RISCV] Correctly sign-extend comparison for ATOMIC_CMP_XCHG Summary: Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised with GetPromotedInteger, which leaves the upper bits of the value undefined. Since this is used for comparing in an LR/SC loop with a full-width comparison, we must sign extend it. We introduce a new getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since many targets have compare-and-swap instructions (or pseudos) that correctly handle an any-extend input, and the existing function determines the extension of the result, whereas we are concerned with the input. This is related to https://reviews.llvm.org/D58829, which solved the issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler ATOMIC_CMP_SWAP. Reviewers: asb, lenary, efriedma Reviewed By: asb Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74453	2020-04-01 15:51:26 +01:00
Guillaume Chatelet	1dffa2550b	[Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign() Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77215	2020-04-01 14:08:28 +00:00
Guillaume Chatelet	3a78f44daf	[Alignment][NFC] Convert SelectionDAG::InferPtrAlignment to MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77212	2020-04-01 13:22:11 +00:00
Guillaume Chatelet	bf573bea19	[Alignment][NFC] Convert MIR Yaml to MaybeAlign Summary: Although it may not look like NFC, it is. In particular, the MIRParser may set `0` to the MachineFrameInfo and MachineFunction, but they all deal with `Align` internally and assume that `0` means `1`. `93fc0ba145/llvm/include/llvm/CodeGen/MachineFrameInfo.h (L483)` This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77203	2020-04-01 12:26:31 +00:00
Guillaume Chatelet	c7468c1696	[Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77149	2020-04-01 09:32:05 +00:00
Qiu Chaofan	95bcab8272	[DAGCombiner] Require ninf for sqrt recip estimation Currently, DAG combiner uses (fmul (rsqrt x) x) to estimate square root of x. However, this method would return NaN if x is +Inf, which is incorrect. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D76853	2020-04-01 16:23:43 +08:00
Craig Topper	f92563f907	[VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template Reviewers: spatel, efriedma, RKSimon Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77183	2020-04-01 00:46:48 -07:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Guozhi Wei	6d20937c29	[CodeGenPrepare] Delete intrinsic call to llvm.assume to enable more tailcall The attached test case is simplified from tcmalloc. Both function calls should be optimized as tailcall. But llvm can only optimize the first call. The second call can't be optimized because function dupRetToEnableTailCallOpts failed to duplicate ret into block case2. Two problems blocked the duplication: 1 Intrinsic call llvm.assume is not handled by dupRetToEnableTailCallOpts. 2 The control flow is more complex than expected, dupRetToEnableTailCallOpts can only duplicate ret into its predecessor, but here we have an intermediate block between call and ret. The solutions: 1 Since CodeGenPrepare is already at the end of LLVM IR phase, we can simply delete the intrinsic call to llvm.assume. 2 A general solution to the complex control flow is hard, but for this case, after exit2 is duplicated into case1, exit2 is the only successor of exit1 and exit1 is the only predecessor of exit2, so they can be combined through eliminateFallThrough. But this function is called too late, there is no more dupRetToEnableTailCallOpts after it. We can add an earlier call to eliminateFallThrough to solve it. Differential Revision: https://reviews.llvm.org/D76539	2020-03-31 11:55:51 -07:00
Guillaume Chatelet	998118c3d3	[Alignment][NFC] Deprecate MachineMemOperand::getMachineMemOperand version that takes an untyped alignment. Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77138	2020-03-31 16:05:31 +00:00
Guillaume Chatelet	b9810988b2	[Alignment][NFC] Transitioning more getMachineMemOperand call sites Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77127	2020-03-31 11:04:10 +00:00
Denis Antrushin	47107dc3bd	[Statepoint] Fix StatepointLoweringInfo::GCTransitionArgs initialization Summary: In method SelectionDAGBuilder::LowerStatepoint, array SI.GCTransitionArgs is initialized from wrong part of ImmutableStatepoint class. We copy gc args instead of transitions args. Reviewers: reames, skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77075	2020-03-31 11:45:06 +03:00
Guillaume Chatelet	c9d5c19597	[Alignment][NFC] Transitioning more getMachineMemOperand call sites Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, Jim, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77121	2020-03-31 08:36:18 +00:00
Guillaume Chatelet	d2d6c9f591	[Alignment][NFC] GlobalISel Utils inferAlignFromPtrInfo Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77079	2020-03-31 06:58:57 +00:00
Guillaume Chatelet	af3c52d558	[Alignment][NFC] Simplify IRTranslator::getMemOpAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77078	2020-03-31 06:57:13 +00:00
Craig Topper	2a07221cf3	[SelectionDAG] Add an assert that the input VT and output VT for ISD::FREEZE are the same. Differential Revision: https://reviews.llvm.org/D77092	2020-03-30 23:21:58 -07:00
Jessica Paquette	d5ee72065b	[GlobalISel] Implement identity transforms for x op x -> x When we have ``` a = G_OR x, x ``` or ``` b = G_AND y, y ``` We can drop the G_OR/G_AND and just use x/y respectively. Also update arm64-fallback.ll because there was an or in there which hits this transformation. Differential Revision: https://reviews.llvm.org/D77105	2020-03-30 18:22:37 -07:00
Juneyoung Lee	519f5c3796	[LegalizeTypes] Add SoftenFloatRes_FREEZE Summary: This adds SoftenFloatRes_FREEZE. Reviewers: bkramer, JamesNagurne, craig.topper, efriedma Reviewed By: craig.topper Subscribers: AbigailLinden, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76980	2020-03-31 10:16:38 +09:00
Jessica Paquette	63d70ea6a0	[GlobalISel] Combine (x op 0) -> x for operations with a right identity of 0 Implement identity combines for operations like the following: ``` %a = G_SUB %b, 0 ``` This can just be replaced with %b. Over CTMark, this gives some minor size improvements at -O3. Differential Revision: https://reviews.llvm.org/D76640	2020-03-30 16:49:52 -07:00
Eli Friedman	cf36f9855a	[SVE][SelectionDAG] Fix dumping of EVTs to use correct API for element count. This makes "-debug" output for SVE SelectionDAG readable.	2020-03-30 16:47:53 -07:00
Matt Arsenault	b8fc192d42	Revert "[GISel]: Fix incorrect IRTranslation while translating null pointer types" This reverts commit `b3297ef051`. This change is incorrect. The current semantic of null in the IR is a pointer with the bitvalue 0. It is not a cast from an integer 0, so this should preserve the pointer type.	2020-03-30 19:30:42 -04:00
Nick Desaulniers	f086941765	[SelectionDAGISel] small cleanup to INLINEASM_BR selection. NFC Summary: This code was throwing away the opcode for a boolean, which was then reconstructing the opcode from that boolean. Just pass the opcode, and forget the boolean. Reviewers: srhines Reviewed By: srhines Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77100	2020-03-30 15:32:06 -07:00
Matt Arsenault	4919f2e1c5	AMDGPU/GlobalISel: Basic legalize rules for G_FSHR Only handles easy 32-bit cases.	2020-03-30 11:53:01 -07:00
Matt Arsenault	23da702d69	GlobalISel: Translate llvm.fshl/llvm.fshr	2020-03-30 11:34:42 -07:00
Jakub Kuderski	77ce2e21a8	[AMDGPU] Add Relocation Constant Support Summary: This change adds amdgcn.reloc.constant intrinsic to the amdgpu backend, which will compile into a relocation entry in the resulting elf. The intrinsics takes a MetadataNode (String) as its only argument, which specifies the symbol name of the relocation entry. `SelectionDAGBuilder::getValueImpl` is changed to allow metadata operands passed through to ISel. Author: csyonghe <yonghe@google.com> Reviewers: tpr, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76440	2020-03-30 13:49:20 -04:00
Guillaume Chatelet	bdf77209b9	[Alignment][NFC] Use Align version of getMachineMemOperand Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77059	2020-03-30 15:46:27 +00:00
Matt Arsenault	cc3b5590d2	GlobalISel: Minor cleanups	2020-03-30 11:26:22 -04:00
Guillaume Chatelet	01ba2ad9ef	[Alignment][NFC] Provide tightened up functions in SelectionDAG, MachineFunction and MachineMemOperand Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77046	2020-03-30 13:03:27 +00:00
Guillaume Chatelet	b91535f6c7	[Alignment][NFC] Return Align for SelectionDAGNodes::getOriginalAlignment/getAlignment Summary: Also deprecate getOriginalAlignment; getAlignment will take much more time, as it is pervasive through the codebase (including TableGened files). This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76933	2020-03-30 07:26:48 +00:00
Reid Kleckner	e5bf5037d8	[CodeGen] Fix sinking local values in lpads with phis There was already a test case for landingpads to handle this case, but I had forgotten to consider PHI instructions preceding the EH_LABEL in the landingpad. PR45261	2020-03-28 11:10:33 -07:00
Martin Storsjö	e6112a56dd	[AsmPrinter] Emit .weak directive for weak linkage on COFF for symbols without a comdat MC already knows how to emulate the .weak directive (with its ELF semantics; i.e., an undefined weak symbol resolves to 0, and a defined weak symbol has lower link precedence than a strong symbol of the same name) using COFF weak externals. Plumb this through the ASM printer too, so that definitions marked with __attribute__((weak)) at the language level (which gets translated to weak linkage at the IR level) have the corresponding .weak directive emitted. Note that declarations marked with __attribute__((weak)) at the language level (which translates to extern_weak at the IR level) already have .weak directives emitted. Weak/linkonce symbols without an associated comdat (in particular, ones generated with __attribute__((weak)) in C/C++) were earlier emitted as normal unique globals, as the comdat is required to provide the linkonce semantics. This change makes sure they are emitted as .weak instead, allowing other symbols to override them. Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not quite sure what that test covers, since the behavior being tested in it (the emission of a one_only section) is just a result of passing -function-sections to llc; the linkonce_odr makes no difference. Add a new coff-weak.ll which tests the new directive emission. Based on a previous patch by Shoaib Meenai. Differential Revision: https://reviews.llvm.org/D44543	2020-03-28 18:48:58 +02:00
Jessica Paquette	98d05f88d5	[GlobalISel] Fix equality for copies from physregs in matchEqualDefs When we see this: ``` %a = COPY $physreg ... SOMETHING implicit-def $physreg ... %b = COPY $physreg ``` The two copies are not equivalent, and so we shouldn't perform any folding on them. When we have two instructions which use a physical register check that they define the same virtual register(s) as well. e.g., if we run into this case ``` %a = COPY $physreg ... %b = COPY %a ``` we can say that the two copies are the same, and can be folded. Differential Revision: https://reviews.llvm.org/D76890	2020-03-27 17:52:21 -07:00
Nemanja Ivanovic	4821411347	[DAGCombine] Fix splitting indexed loads in ForwardStoreValueToDirectLoad() In DAGCombiner::visitLOAD() we perform some checks before breaking up an indexed load. However, we don't do the same checking in ForwardStoreValueToDirectLoad() which can lead to failures later during combining (see: https://bugs.llvm.org/show_bug.cgi?id=45301). This patch just adds the same checks to this function as well. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45301 Differential revision: https://reviews.llvm.org/D76778	2020-03-27 18:03:47 -05:00
Matt Arsenault	a8cc9047de	CodeGen: Add -denormal-fp-math-f32 flag Make the set of FP related attributes and command flags closer.	2020-03-27 14:00:39 -07:00
Matt Arsenault	0ab5b5b858	Fix denormal-fp-math flag and attribute interaction Make these behave the same way unsafe-fp-math and co. The command line flag should add the attribute to functions that do not already have it, and leave existing attributes. The attribute is the actual implementation, but the flag is useful in some testing situations. AMDGPU has a variety of tests with denormals enabled/disabled that would require a painful level of test duplication without a flag. This doesn't expose setting the separate input/output modes, or add a flag for the f32 version yet. Tests will be included in future patch.	2020-03-27 12:48:58 -07:00
Guillaume Chatelet	74eac9031a	[Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76925	2020-03-27 15:49:13 +00:00
Guillaume Chatelet	a98662f4c1	[Alignment][NFC] Update MachineMemOperand implementation to use MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76625	2020-03-27 08:06:10 +00:00
Juneyoung Lee	1bcc500b48	[DAGCombine] Add basic optimizations for FREEZE in SelDag Summary: This patch is the first effort to add basic optimizations for FREEZE in SelDag. Reviewers: spatel, lebedev.ri Reviewed By: spatel Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76707	2020-03-27 12:20:39 +09:00
Craig Topper	9f7d4150b9	[X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelectionDAG. These transforms rely on a vector reduction flag on the SDNode set by SelectionDAGBuilder. This flag exists because SelectionDAG can't see across basic blocks so SelectionDAGBuilder is looking across and saving the info. X86 is the only target that uses this flag currently. By removing the X86 code we can remove the flag and the SelectionDAGBuilder code. This pass adds a dedicated IR pass for X86 that looks across the blocks and transforms the IR into a form that the X86 SelectionDAG can finish. An advantage of this new approach is that we can enhance it to shrink the phi nodes and final reduction tree based on the zeroes that we need to concatenate to bring the partially reduced reduction back up to the original width. Differential Revision: https://reviews.llvm.org/D76649	2020-03-26 14:10:20 -07:00
diggerlin	fdfe411e7c	[AIX] discard the label in the csect of function description and use qualname for linkage SUMMARY: For a source file "test.c" void foo() {}; llc will generate assembly code as (assembly patch) .globl foo .globl .foo .csect foo[DS] foo: .long .foo .long TOC[TC0] .long 0 and symbol table as (xcoff object file) [4] m 0x00000004 .data 1 unamex foo [5] a4 0x0000000c 0 0 SD DS 0 0 [6] m 0x00000004 .data 1 extern foo [7] a4 0x00000004 0 0 LD DS 0 0 After first patch, the assembly will be as .globl foo[DS] # -- Begin function foo .globl .foo .align 2 .csect foo[DS] .long .foo .long TOC[TC0] .long 0 and symbol table will be as [6] m 0x00000004 .data 1 extern foo [7] a4 0x00000004 0 0 DS DS 0 0 Change the code for the assembly path and xcoff objectfile patch for llc. Reviewers: Jason Liu Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D76162	2020-03-26 15:46:52 -04:00
Dominik Montada	9fedb6900d	[GlobalISel] add helper function to create arbitrary libcalls Summary: The existing helper function can only create a libcall to functions available in RTLIB. Add a helper function that can create a libcall to a given function name using the provided calling convention. Reviewers: aditya_nandakumar, t.p.northover, rovka, arsenm, dsanders Reviewed By: arsenm Subscribers: wdng, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76845	2020-03-26 16:11:13 +01:00
Qiu Chaofan	172456c775	[Legalizer] Fix some flags miss in vector results In some scalarize/split result methods (unary, binary, ...), flags in SDNode were not passed down, which may lead to unexpected results in unsafe floating-point optimization. This patch fixes them. (maybe not complete) Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D76832	2020-03-26 22:01:19 +08:00
Juneyoung Lee	453eac3f77	Minor fixes to a comment in CodeGenPrepare	2020-03-25 16:34:43 +09:00
Matt Arsenault	39c55cef21	GlobalISel: Introduce bitcast legalize action For some operations, the type is unimportant and only the number of bits matters. For example I don't want to treat <4 x s8> as a legal type, but I also don't want to decompose loads of this into smaller pieces to get legal register types. On AMDGPU in SelectionDAG, we legalize a number of operations (most notably load and store) by coercing all types to vectors of i32. For GlobalISel, I'm trying very hard to avoid doing this for every type, but I don't think this strategy can be completely avoided. I'm trying to avoid bitcasts for any legitimately legal type we can operate on, since the intervening bitcasts have proven to be a hassle. For loads, I think I can get away without ever casting the result type, and handling any arbitrary bitwidth during selection (I will eventually want new tablegen support to help with this, rather than having to add every possible type as legal). The unmerge required to do anything with the value should expand to the expected shifts. This is trickier for stores, since it would now require handling a wide array of truncates during selection which I don't want. Future potentially interesting cases are for vector indexing, where sub-dword types should be indexed in s32 pieces.	2020-03-24 19:33:33 -04:00
Vedant Kumar	f7052da6db	[DWARF] Emit DW_AT_call_pc for tail calls Record the address of a tail-calling branch instruction within its call site entry using DW_AT_call_pc. This allows a debugger to determine the address to use when creating artificial frames. This creates an extra attribute + relocation at tail call sites, which constitute 3-5% of all call sites in xnu/clang respectively. rdar://60307600 Differential Revision: https://reviews.llvm.org/D76336	2020-03-24 12:01:55 -07:00
Benjamin Kramer	0019c2f194	[SelectionDAG] Don't crash when freezing illegal float types	2020-03-24 19:45:19 +01:00
Hiroshi Yamauchi	c3417592c8	Revert "Include static prof data when collecting loop BBs" This reverts commit `129c911efa`. Due to an internal benchmark regression.	2020-03-24 09:41:16 -07:00
Lama	4a6ebc03ba	[MachinePipeliner] Fix a bug in Output Dependency chains The current implementation collects all Preds/Succs of a Dep of kind Output, creating a long chain and subsequently a schedule with an unnecessarily large II. Was this done on purpose for a reason I'm missing? Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D75424	2020-03-24 14:37:50 +00:00
Juneyoung Lee	7802be4a3d	[SelDag] Add FREEZE Summary: - Add FREEZE node to SelDag - Lower FreezeInst (in IR) to FREEZE node - Add Legalization for FREEZE node Reviewers: qcolombet, bogner, efriedma, lebedev.ri, nlopes, craig.topper, arsenm Reviewed By: lebedev.ri Subscribers: wdng, xbolva00, Petar.Avramovic, liuz, lkail, dylanmckay, hiraditya, Jim, arsenm, craig.topper, RKSimon, spatel, lebedev.ri, regehr, trentxintong, nlopes, mkuper, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D29014	2020-03-24 23:04:58 +09:00
Jinsong Ji	816ad48c82	[NFC][RUIP] Small debug output refine Add a new line, so that we always print MI in a new line, before and after UpdateRegMask, for easier checking.	2020-03-24 03:29:45 +00:00
Jessica Paquette	02187ed45a	[GlobalISel] Combine G_SELECTs of the form (cond ? x : x) into x When we find something like this: ``` %a:_(s32) = G_SOMETHING ... ... %select:_(s32) = G_SELECT %cond(s1), %a, %a ``` We can remove the select and just replace it entirely with `%a` because it's always going to result in `%a`. Same if we have ``` %select:_(s32) = G_SELECT %cond(s1), %a, %b ``` where we can deduce that `%a == %b`. This implements the following cases: - `%select:_(s32) = G_SELECT %cond(s1), %a, %a` -> `%a` - `%select:_(s32) = G_SELECT %cond(s1), %a, %some_copy_from_a` -> `%a` - `%select:_(s32) = G_SELECT %cond(s1), %a, %b` -> `%a` when `%a` and `%b` are defined by identical instructions This gives a few minor code size improvements on CTMark at -O3 for AArch64. Differential Revision: https://reviews.llvm.org/D76523	2020-03-23 16:46:03 -07:00
Matt Arsenault	aa63eb6a46	GlobalISel: Add computeKnownBitsForTargetInstr I think we can save the MRI argument from these since it's in GISelKnownBits already, but currently not accessible. Implementation deferred to avoid dependency on other patches.	2020-03-23 15:02:30 -04:00
Reid Kleckner	5ff5ddd0ad	[Win64] Insert int3 into trailing empty BBs Otherwise, the Win64 unwinder considers direct branches to such empty trailing BBs to be a branch out of the function. It treats such a branch as a tail call, which can only be part of an epilogue. If the unwinder misclassifies such a branch as part of the epilogue, it will fail to unwind the stack further. This can lead to bad stack traces, or failure to handle exceptions properly. This is described in https://llvm.org/PR45064#c4, and by the comment at the top of the X86AvoidTrailingCallPass.cpp file. It should be safe to insert int3 for such blocks. An empty trailing BB that reaches this pass is pretty much guaranteed to be unreachable. If a program executed such a block, it would fall off the end of the function. Most of the complexity in this patch comes from threading through the "EHFuncletEntry" boolean on the MIRParser and registering the pass so we can stop and start codegen around it. I used an MIR test because we should teach LLVM to optimize away these branches as a follow-up. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D76531	2020-03-23 08:50:37 -07:00
Jay Foad	0444d16a16	[GlobalISel] Add generic opcodes for saturating add/subtract Summary: Add new generic MIR opcodes G_SADDSAT etc. Add support in IRTranslator for translating the saturating add/subtract intrinsics to the new opcodes. Reviewers: aemerson, dsanders, paquette, arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76600	2020-03-23 15:16:45 +00:00
Sanjay Patel	0eeee83d75	[VectorUtils] move x86's scaleShuffleMask to generic VectorUtils We have some long-standing missing shuffle optimizations that could use this transform via VectorCombine now: https://bugs.llvm.org/show_bug.cgi?id=35454 (and we still don't get that case in the backend either) This function is apparently templated because there's existing code in IR that treats mask values as unsigned and backend code that treats mask values as signed. The mask values are not endian-dependent (as shown by the existing bitcast transform from DAGCombiner). Differential Revision: https://reviews.llvm.org/D76508	2020-03-23 09:58:55 -04:00
Guillaume Chatelet	3ba550a05a	[Alignment][NFC] Use TFL::getStackAlign() Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: dylanmckay, sdardis, nemanjai, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76551	2020-03-23 13:48:29 +01:00
Guillaume Chatelet	ea64ee0edb	[Alignment][NFC] Deprecate ensureMaxAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76368	2020-03-23 11:31:33 +01:00
Jay Foad	7cdbf1ed4b	Make use of APInt::countLeadingOnes. NFC.	2020-03-23 09:08:20 +00:00
Sam Parker	62fdb1f534	[DAGCombine] Skip PostInc combine with later users When deciding whether to generate a post-inc load/store, look at the other memory nodes that use the same base address and, if any proceed the current node, then don't do the combine. The change only seems to be affecting the Arm backend, which I was surprised at, but it appears to fix a lot of our issues around MVE masked load/stores having to store a temporary address after an early post-increment on a shared base address. Differential Revision: https://reviews.llvm.org/D75847	2020-03-23 08:39:53 +00:00
Sam Parker	8e45eaf1da	[NFC][DAGCombine] Refactor post-inc logic Extract the decision to combine into a post-inc address into a couple of functions to make the logic more clear and re-usable. Differential Revision: https://reviews.llvm.org/D76060	2020-03-23 08:32:20 +00:00
Dominik Montada	ccf49b9ef0	[GlobalISel] support widen unmerge if WideTy > SrcTy Summary: Widening G_UNMERGE_VALUES to a type which is larger than the original source type is the same as widening it to the same type as the source type: in both cases, G_UNMERGE_VALUES has to be replaced with bit arithmetic. Although the arithmetic itself is independent of whether the source type is smaller or equal to the widen type, widening the source type to the widen type should result in fewer artifacts being emitted, since this is the type that the user explicitly requested. Reviewers: arsenm, dsanders, aemerson, aditya_nandakumar Reviewed By: arsenm, dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76494	2020-03-23 09:16:45 +01:00
Qiu Chaofan	763871053c	[DAGCombiner] Require nsz for aggressive fma fold For folding pattern `x-(fma y,z,u*v) -> (fma -y,z,(fma -u,v,x))`, if `yz` is 1, `uv` is -1 and `x` is -0, sign of result would be changed. Differential Revision: https://reviews.llvm.org/D76419	2020-03-22 23:10:07 +08:00
Fangrui Song	71f8b78d89	[AsmPrinter] Simplify AsmPrinter::emitXXStructorList after D61547	2020-03-21 23:18:23 -07:00
Simon Pilgrim	c5fd9e3888	[DAG] Don't permit EXTLOAD when combining FSHL/FSHR consecutive loads (PR45265) Technically we can permit EXTLOAD of the LHS operand but only if all the extended bits are shifted out. Until we test coverage for that case, I'm just disabling this to fix PR45265.	2020-03-21 10:52:41 +00:00
Fangrui Song	85c30f3374	[X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile -fuse-init-array is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false. The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called. clang -target i386 -c a.c clang -target x86_64 -c a.c This cleanup fixes this as a bonus. X86SpeculativeLoadHardeningPass::tracePredStateThroughCall can call MCContext::createTempSymbol before TargetLoweringObjectFileELF::Initialize(). We need to call TargetLoweringObjectFileELF::Initialize() earlier. test/CodeGen/X86/speculative-load-hardening-indirect.ll Differential Revision: https://reviews.llvm.org/D71360	2020-03-20 21:57:34 -07:00
Eric Christopher	fc7233d774	Temporarily Revert "[X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile" as it's causing msan failures. This reverts commit `7899fe9da8`.	2020-03-20 17:36:12 -07:00
Vedant Kumar	a245943355	[LiveDebugValues] Speed up collectIDsForRegs, NFC Use the advanceToLowerBound operation available on CoalescingBitVector iterators to speed up collection of variables which reside within some set of registers. The speedup comes from avoiding repeated top-down traversals in IntervalMap::find. The linear scan forward from one register interval to the next is unlikely to be as expensive as a full IntervalMap search starting from the root. This reduces time spent in LiveDebugValues when compiling sqlite3 by 200ms (about 0.1% - 0.2% of the total User Time). Depends on D76466. rdar://60046261 Differential Revision: https://reviews.llvm.org/D76467	2020-03-20 12:18:26 -07:00
Vedant Kumar	4716ebb823	[ADT] CoalescingBitVector: Avoid initial heap allocation, NFC Avoid making a heap allocation when constructing a CoalescingBitVector. This reduces time spent in LiveDebugValues when compiling sqlite3 by 700ms (0.5% of the total User Time). rdar://60046261 Differential Revision: https://reviews.llvm.org/D76465	2020-03-20 12:18:25 -07:00
Fangrui Song	7899fe9da8	[X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile UseInitArray is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false. The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called. clang -target i386 -c a.c clang -target x86_64 -c a.c This cleanup fixes this as a bonus. Differential Revision: https://reviews.llvm.org/D71360	2020-03-20 11:18:36 -07:00
Vedant Kumar	636665331b	PR45181: Fix another invalid DIExpression combination The original test case from PR45181 triggers a DIExpression combination that wasn't fixed in D76164.	2020-03-20 11:18:05 -07:00
Pirama Arumuga Nainar	edcfb47ff6	[DAGCombiner] Do not fold truncate(build_vector(..)) if it creates an illegal type Summary: It can be the case that a vector type is legal but the corresponding scalar type is not legal for an architecture (i8 vs. v16i8 on AArch64). Check if the scalar type created when folding truncate(build_vector(x,y)) -> build_vector(truncate(x),truncate(y)) is legal if we are running after the type legalizer. This fixes https://github.com/android/ndk/issues/1207. Reviewers: RKSimon, srhines Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76312	2020-03-20 09:20:16 -07:00
Bjorn Pettersson	d168b77780	[DAGCombiner] Fix non-determinism problem related to argument evaluation order in visitFDIV Summary: For some reason the order in which we call getNegatedExpression for the involved operands, after a call to isCheaperToUseNegatedFPOps, seems to matter. This patch includes a new test case in test/CodeGen/X86/fdiv.ll that crashes if we reverse the order of those calls. Before this patch that could happen depending on which compiler was used when building LLVM. With my GCC version (7.4.0) I got the crash, because it seems like it is using a different order for the argument evaluation compared to clang. All other users of isCheaperToUseNegatedFPOps already used this pattern with unfolded/ordered calls to getNegatedExpression, so this patch is aligning visitFDIV with the other use cases. This patch simply deals with the non-determinism for FDIV, while the underlying problem with getNegatedExpression is discussed further in D76439. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76319	2020-03-20 16:11:17 +01:00
Adrian Kuegel	baa6f6a782	Revert "[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes" This reverts commit `e9f22fd429`. When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70 failing tests with this revision, and 29 without this revision.	2020-03-20 11:02:50 +01:00
Wei Mi	a035726e5a	Revert "Generate Callee Saved Register (CSR) related cfi directives like .cfi_restore." This reverts commit `3c96d01d2e`. Got report that it caused test failures in libc++.	2020-03-19 22:45:27 -07:00
Jessica Paquette	c999084619	[GlobalISel] Port some basic shufflevector undef combines from the DAGCombiner Port over the following: - shuffle undef, undef, any_mask -> undef - shuffle anything, anything, undef_mask -> undef This sort of thing shows up a lot when you try to bugpoint code containing shufflevector. Differential Revision: https://reviews.llvm.org/D76382	2020-03-19 16:46:06 -07:00
Sanjay Patel	56da41393d	[SDAG] reduce code duplication in getNegatedExpression(); NFCI	2020-03-19 13:55:15 -04:00
Djordje Todorovic	d9b9621009	Reland D73534: [DebugInfo] Enable the debug entry values feature by default The issue that was causing the build failures was fixed with the D76164.	2020-03-19 13:57:30 +01:00
Cullen Rhodes	5ce38fcbac	[ValueTypes] Add support for scalable EVTs Summary: * Remove a bunch of asserts checking for unsupported scalable types and add some more now that they are supported. * Propagate the scalable flag where necessary. * Add another `EVT::getExtendedVectorVT` method that takes an ElementCount parameter. * Add `EVT::isExtendedScalableVector` and `EVT::getExtendedVectorElementCount` - latter is currently unused. Reviewers: sdesmalen, efriedma, rengolin, craig.topper, huntergr Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75672	2020-03-19 11:04:15 +00:00
Cullen Rhodes	5c296df0c0	[ValueTypes] Add EVT::isFixedLengthVector Summary: Related to D75672, this patch adds EVT::isFixedLengthVector to determine if the underlying vector type is of fixed length. An assert is introduced in EVT::getVectorNumElements that triggers for types that aren't fixed length. This is currently guarded by a flag added in D75297 that is off by default and has been renamed to the more generic ENABLE_STRICT_FIXED_SIZE_VECTORS. Ideally we want to get rid of getVectorNumElements but a quick grep shows there are >350 uses in lib/CodeGen and 75 in lib/Target/AArch64 alone. All of these probably aren't EVT::getVectorNumElements (some may be the MVT equivalent), but there are many places to fix up and having the assert on by default would make the SVE upstreaming effort difficult. Reviewers: sdesmalen, efriedma, ctetreau, huntergr, rengolin Reviewed By: efriedma Subscribers: mgorny, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76376	2020-03-19 10:08:17 +00:00
Craig Topper	c69a4d6bef	[SelectionDAG] When splitting gathers/scatters in type legalization, set MMO size to UnknownSize Gather/scatter don't access one memory location, they access multiple disjoint locations. So using a fixed size isn't accurate. But we don't have a way to represent the true behavior so just use UnknownSize. Previously we "split" the memory VT and use that size for the MMO of each half. But the memory VT is scalar so splitting usually just returned the original scalar VT, but on 32-bit X86 if the scalar VT was i64 it probably returned i32? Differential Revision: https://reviews.llvm.org/D76388	2020-03-18 16:07:15 -07:00
Eli Friedman	e24e95fe90	Remove CompositeType class. The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660	2020-03-18 13:53:17 -07:00
Craig Topper	498b53890d	[SelectionDAGBuilder][FPEnv] Take into account SelectionDAG continuous CSE when setting the nofpexcept flag for constrained intrinsics SelectionDAG CSEs nodes based on their result type and operands, but not their flags. The flags are expected to be intersected when they are CSEd. In SelectionDAGBuilder, for FP nodes we manage both the fast math flags and the nofpexcept flag after the nodes have already been CSEd when they were created with getNode. The management of the fastmath flags before the constrained nodes prevents the nofpexcept management from working correctly. This commit moves the FMF handling for constrained intrinsics into their visitor and disables the common FMF handling for these nodes. Differential Revision: https://reviews.llvm.org/D75224	2020-03-18 13:37:17 -07:00
lewis-revill	e9f22fd429	[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes This patch generates TableGen descriptions for the specified register banks which contain a list of register sizes corresponding to the available HwModes. The appropriate size is used during codegen according to the current HwMode. As this HwMode was not available on generation, it is set upon construction of the RegisterBankInfo class. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Differential Revision: https://reviews.llvm.org/D76007	2020-03-18 19:52:23 +00:00
Simon Pilgrim	746bd860c9	Replace getAlignment() methods with getAlign() equivalents. Fixes deprecation warning in EXPENSIVE_CHECKS builds.	2020-03-18 18:25:07 +00:00
Jessica Paquette	dc5f982639	[GlobalISel] Port some basic undef combines from DAGCombiner.cpp This ports some combines from DAGCombiner.cpp which perform some trivial transformations on instructions with undef operands. Not having these can make it extremely annoying to find out where we differ from SelectionDAG by looking at existing lit tests. Without them, we tend to produce pretty bad code generation when we run into instructions which use undef operands. Also remove the nonpow2_store_narrowing testcase from arm64-fallback.ll, since we no longer fall back on the add. Differential Revision: https://reviews.llvm.org/D76339	2020-03-18 11:05:44 -07:00
Jin Lin	0d896278c8	Support repeated machine outlining Summary: The following change allows machine outlining to be applied N times, where N is specified by a compiler option. By default the value of N is 1. The motivation is that the repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027	2020-03-18 10:48:52 -07:00
Oliver Stannard	73cea83a6f	[IPRA][ARM] Spill extra registers at -Oz When optimising for code size at the expense of performance, it is often worth saving and restoring some of r0-r3, if IPRA will be able to take advantage of them. This doesn't cost any extra code size if we already have a PUSH/POP pair, and increases the number of available registers across any calls to the function. We already have an optimisation which tries to fold the subtract/add of the SP into the PUSH/POP by using extra registers, which somewhat conflicts with this. I've made the new optimisation less aggressive in cases where the existing one is likely to trigger, which gives better results than either of these optimisations by themselves. Differential revision: https://reviews.llvm.org/D69936	2020-03-18 13:51:16 +00:00
Guillaume Chatelet	d000655a8c	[Alignment][NFC] Deprecate getMaxAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76348	2020-03-18 14:48:45 +01:00
Danila Malyutin	940ba1465b	Fix possible assertion when using PBQP with debug info Skip debug instructions before calling functions not expecting them. In particular, LIS.getInstructionIndex(*mi) would fail if mi was a debug instr. Differential Revision: https://reviews.llvm.org/D76129	2020-03-18 15:29:42 +03:00
Guillaume Chatelet	c3df69faa0	[Alignment][NFC] Deprecate getTransientStackAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76301	2020-03-18 09:02:48 +01:00
QingShan Zhang	d577193c0f	[DAGCombine] Respect the uses when combine FMA for ab+/-cd If it is ab-cd, it could be also folded into fma(a, b, -cd) or fma(-c, d, ab). This patch is trying to respect the uses of ab and cd to make the best choice. Differential Revision: https://reviews.llvm.org/D75982	2020-03-18 03:34:27 +00:00
Jin Lin	7b166d5182	Revert "Support repeated machine outlining" This reverts commit `ab2dcff309`.	2020-03-17 18:33:55 -07:00
Jin Lin	ab2dcff309	Support repeated machine outlining Summary: The following change allows machine outlining to be applied N times, where N is specified by a compiler option. By default the value of N is 1. The motivation is that the repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027	2020-03-17 18:11:08 -07:00
Simon Pilgrim	68224c1952	[TargetLowering] Only demand a rotation's modulo amount bits ISD::ROTL/ROTR rotation values are guaranteed to act as a modulo amount, so for power-of-2 bitwidths we only need the lowest bits. Differential Revision: https://reviews.llvm.org/D76201	2020-03-17 21:23:46 +00:00
Vedant Kumar	526c51e6fd	[DwarfDebug] Fix an assertion error when emitting call site info that combines two DW_OP_stack_values When compiling ``` struct S { float w; }; void f(long w, long b); void g(struct S s) { int w = s.w; f(w, w*4); } ``` I get Assertion failed: ((!CombinedExpr || CombinedExpr->isValid()) && "Combined debug expression is invalid"). That's because we combine two expressions that both end in DW_OP_stack_value: ``` (lldb) p Expr->dump() !DIExpression(DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value) (lldb) p Param.Expr->dump() !DIExpression(DW_OP_constu, 4, DW_OP_mul, DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value) (lldb) p CombinedExpr->isValid() (bool) $0 = false (lldb) p CombinedExpr->dump() !DIExpression(4097, 32, 5, 4097, 64, 5, 16, 4, 30, 4097, 32, 5, 4097, 64, 5, 159, 159) ``` I believe that in this particular case combining two stack values is safe, but I didn't want to sink the special handling into DIExpression::append() because I do want everyone to think about what they are doing. Patch by Adrian Prantl. Fixes PR45181. rdar://problem/60383095 Differential Revision: https://reviews.llvm.org/D76164	2020-03-17 12:51:49 -07:00
Scott Constable	080dd10f7d	Move RDF from Hexagon to Codegen RDF is designed to be target agnostic. Therefore it would be useful to have it available for other targets, such as X86. Based on a previous patch by Krzysztof Parzyszek Differential Revision: https://reviews.llvm.org/D75932	2020-03-17 12:43:14 -07:00
Craig Topper	98369178bc	[SelectionDAGBuilder] Don't set MachinePointerInfo for gather when we find a uniform base I believe we were previously calculating a pointer info with the scalar base and an offset of 0. But that's not really where the gather is pointing. The offset is a function of the indices of the GEP we looked through. Also set the size of the MachineMemOperand to UnknownSize Differential Revision: https://reviews.llvm.org/D76157	2020-03-17 11:03:45 -07:00
Jin Lin	b9f1b8be1c	Revert "Support repeated machine outlining" This reverts commit `1f93b162fc`.	2020-03-17 10:03:27 -07:00
Jin Lin	1f93b162fc	Support repeated machine outlining Summary: The following change allows machine outlining to be applied N times, where N is specified by the compiler option. By default the value of N is 1. The motivation is that the repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027	2020-03-17 09:16:11 -07:00
Simon Pilgrim	c9656a3b31	[DAGCombiner] matchRotateSub - handle shift amount truncation Under certain circumstances we'll end up in the position where the negated shift amount will get truncated to the type specified getScalarShiftAmountTy(), so we need to test for a truncated version of the shift amount as well. This allows us to remove half of the remaining patterns tested for by X86ISelLowering's combineOrShiftToFunnelShift.	2020-03-17 16:01:23 +00:00
serge-sans-paille	ac1d23ed7d	Replace MCTargetOptionsCommandFlags.inc and CommandFlags.inc by runtime registration MCTargetOptionsCommandFlags.inc and CommandFlags.inc are headers which contain cl::opt with static storage. These headers are meant to be included by tools to make it easier to parametrize codegen/mc. However, these headers are also included in at least two libraries: lldCommon and handle-llvm. As a result, when creating DYLIB, clang-cpp holds a reference to the options, and lldCommon holds another reference. Linking the two in a single executable, as zig does[0], results in a double registration. This patch explores another approach: the .inc files are moved to regular files, and the registration happens on-demand through static declaration of options in the constructor of a static object. [0] https://bugzilla.redhat.com/show_bug.cgi?id=1756977#c5 Differential Revision: https://reviews.llvm.org/D75579	2020-03-17 14:01:30 +01:00
John Brawn	c09368313c	[StackProtector] Catch direct out-of-bounds when checking address-takenness With -fstack-protector-strong we check if a non-array variable has its address taken in a way that could cause a potential out-of-bounds access. However what we don't catch is when the address is directly used to create an out-of-bounds memory access. Fix this by examining the offsets of GEPs that are ultimately derived from allocas and checking if the resulting address is out-of-bounds, and by checking that any memory operations using such addresses are not over-large. Fixes PR43478. Differential revision: https://reviews.llvm.org/D75695	2020-03-17 12:09:07 +00:00
Michael Liao	d00d6a19dd	Fix `-Wpedantic` warning. NFC.	2020-03-16 22:06:23 -04:00
Sriraman Tallam	df082ac45a	Basic Block Sections support in LLVM. This is the second patch in a series of patches to enable basic block sections support. This patch adds support for: * Creating direct jumps at the end of basic blocks that have fall through instructions. * New pass, bbsections-prepare, that analyzes placement of basic blocks in sections. * Actual placing of a basic block in a unique section with special handling of exception handling blocks. * Supports placing a subset of basic blocks in a unique section. * Support for MIR serialization and deserialization with basic block sections. Parent patch : D68063 Differential Revision: https://reviews.llvm.org/D73674	2020-03-16 16:06:54 -07:00
Matt Arsenault	2e77362626	GlobalISel: Fix lower bswap for vectors This would hit an assertion from trying to use the wrong bitwidth for the constants.	2020-03-16 13:59:08 -04:00
Juneyoung Lee	07a41544fd	Minor fix to a comment in CodeGenPrepare.cpp	2020-03-17 01:10:26 +09:00
Matt Arsenault	19a0350187	GlobalISel: Fix round lowering I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.	2020-03-16 11:37:30 -04:00
Dominik Montada	8ff2dcb18b	[GlobalISel] add additional lowering support for G_INSERT Summary: Add lowering support for inserting pointers or scalars into scalars, vectors or pointers Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75994	2020-03-16 16:27:17 +01:00
Simon Pilgrim	2b3b453a82	[TargetLowering] Only demand a funnelshift's modulo amount bits ISD::FSHL/FSHR shift amount values are guaranteed to act as a modulo amount, so for power-of-2 bitwidths we only need the lowest bits.	2020-03-16 13:52:17 +00:00
Juneyoung Lee	7aecf2323c	[ExpandMemCmp] Correctly set alignment of generated loads Summary: This is a part of the series of efforts for correcting alignment of memory operations. (Other related bugs: https://bugs.llvm.org/show_bug.cgi?id=44388 , https://bugs.llvm.org/show_bug.cgi?id=44543 ) This fixes https://bugs.llvm.org/show_bug.cgi?id=43880 by setting the default alignment of loads to 1. The test CodeGen/AArch64/bcmp-inline-small.ll should have been changed; it was introduced by https://reviews.llvm.org/D64805 . I talked with @evandro, and confirmed that the test is okay to be changed. Two other tests from PowerPC needed changes as well, but the fixes were straightforward. Reviewers: courbet Reviewed By: courbet Subscribers: nlopes, gchatelet, wuzish, nemanjai, kristof.beyls, hiraditya, steven.zhang, danielkiss, llvm-commits, evandro Tags: #llvm Differential Revision: https://reviews.llvm.org/D76113	2020-03-16 22:39:48 +09:00
Juneyoung Lee	6ad63606ea	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch. If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows optimized form because D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-16 12:46:20 +09:00
Juneyoung Lee	4ffe3ac729	Revert "[CodeGenPrepare] Freeze condition when transforming select to br" This reverts commit `10aa7ea951`.	2020-03-16 12:45:54 +09:00
Simon Pilgrim	5641804298	[DAG] MatchRotate - Add funnel shift by variable support Followup to D75114, this patch reuses the existing MatchRotate ROTL/ROTR rotation pattern code to also recognize the more general FSHL/FSHR funnel shift patterns when we have variable shift amounts, matched with MatchFunnelPosNeg which acts in an (almost) equivalent manner to MatchRotatePosNeg.	2020-03-15 11:50:45 +00:00
Juneyoung Lee	10aa7ea951	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch. If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows optimized form because D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-15 11:10:46 +09:00
Brian Cain	ad7b930bd1	Initialize IsFast* values We must initialize these values in case some targets do not assign to them in allowsMemoryAccess().	2020-03-13 17:46:32 -05:00
Craig Topper	431df3d873	[SelectionDAGBuilder] Simplify the struct type handling in getUniformBase.	2020-03-13 14:00:21 -07:00
Nico Weber	f82b32a51e	Revert "Reland "[DebugInfo] Enable the debug entry values feature by default"" This reverts commit `5aa5c943f7`. Causes clang to assert, see https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c4 for a repro.	2020-03-13 15:37:44 -04:00
Juneyoung Lee	c39cb1c0dd	[CodeGenPrepare] Expand freeze conversion to support fcmp and icmp with null Summary: This is a simple patch that expands https://reviews.llvm.org/D75859 to pointer comparison and fcmp Checked with Alive2 Reviewers: reames, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76048	2020-03-13 17:21:33 +09:00
QingShan Zhang	e601196833	[NFC][DAGCombine] Move the fold of ab-c and a-bc into lambda function This will help the review of https://reviews.llvm.org/D75982. It is a simple code refactor.	2020-03-13 02:35:46 +00:00
Arlo Siemsen	1478ed69d3	Add support for SHA256 source file checksums in debug info LLVM currently supports CSK_MD5 and CSK_SHA1 source file checksums in debug info. This change adds support for CSK_SHA256 checksums. The SHA256 checksums are supported by the CodeView debug format. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D75785	2020-03-12 16:32:05 -07:00
Huihui Zhang	118abf2017	[SVE] Update API ConstantVector::getSplat() to use ElementCount. Summary: Support ConstantInt::get() and Constant::getAllOnesValue() for scalable vector type, this requires ConstantVector::getSplat() to take in 'ElementCount', instead of 'unsigned' number of element count. This change is needed for D73753. Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74386	2020-03-12 13:22:41 -07:00
Simon Pilgrim	2a2d242017	[DAGCombine] foldVSelectOfConstants - ensure constants are same type Fix bug identified by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21167, foldVSelectOfConstants must ensure that the 2 build vectors have scalars of the same type before trying to compare APInt values.	2020-03-12 20:02:05 +00:00
Thomas Lively	4e589e6c26	[WebAssembly] Fix SIMD shift unrolling to avoid assertion failure Summary: Using the default DAG.UnrollVectorOp on v16i8 and v8i16 vectors results in i8 or i16 nodes being inserted into the SelectionDAG. Since those are illegal types, this causes a legalization assertion failure for some code patterns, as uncovered by PR45178. This change unrolls shifts manually to avoid this issue by adding and using a new optional EVT argument to DAG.ExtractVectorElements to control the type of the extract_element nodes. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76043	2020-03-12 12:20:14 -07:00
Marcello Maggioni	ba5500f27a	[RAGreedy] Fix minor typo in comment. NFC	2020-03-12 08:15:04 -07:00
Andrzej Warzynski	46b9f14d71	[AArch64][SVE] Add intrinsics for non-temporal scatters/gathers Summary: This patch adds the following intrinsics for non-temporal gather loads and scatter stores: * aarch64_sve_ldnt1_gather_index * aarch64_sve_stnt1_scatter_index These intrinsics implement the "scalar + vector of indices" addressing mode. As opposed to regular and first-faulting gathers/scatters, there's no instruction that would take indices and then scale them. Instead, the indices for non-temporal gathers/scatters are scaled before the intrinsics are lowered to `ldnt1` instructions. The new ISD nodes, GLDNT1_INDEX and SSTNT1_INDEX, are only used as placeholders so that we can easily identify the cases implemented in this patch in performGatherLoadCombine and performScatterStoreCombined. Once encountered, they are replaced with: * GLDNT1_INDEX -> SPLAT_VECTOR + SHL + GLDNT1 * SSTNT1_INDEX -> SPLAT_VECTOR + SHL + SSTNT1 The patterns for lowering ISD::SHL for scalable vectors (required by this patch) were missing, so these are added too. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D75601	2020-03-12 13:55:56 +00:00
Dominik Montada	6b96623dcb	[GlobalISel] fix crash in narrowScalarExtract if DstRegs only has one register Summary: When narrowing a scalar G_EXTRACT where the destination lines up perfectly with a single result of the emitted G_UNMERGE_VALUES a COPY should be emitted instead of unconditionally trying to emit a G_MERGE_VALUES. Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75743	2020-03-12 09:14:35 +01:00
Tres Popp	bbe6764711	Remove unused variable. Delete dead code from `8fffa40400`.	2020-03-12 08:42:57 +01:00
Philip Reames	8fffa40400	[GC] Remove redundant entries in stackmap section (and test it this time) This is a reimplementation of the optimization removed in D75964. The actual spill/fill optimization is handled by D76013, this one just worries about reducing the stackmap section size itself by eliminating redundant entries. As noted in the comments, we could go a lot further here, but avoiding the degenerate invoke case as we did before is probably "enough" in practice. Differential Revision: https://reviews.llvm.org/D76021	2020-03-11 21:24:48 -07:00
Bill Wendling	6aebf0ee56	Specify branch probabilities for callbr dests Summary: callbr's indirect branches aren't expected to be taken, so reduce their probabilities to 0 while increasing the default destination to 1. This allows some code improvements through block placement. Reviewers: nickdesaulniers Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72656	2020-03-11 20:33:48 -07:00
Adrian Prantl	d5180ea134	Add debug info support for Swift/Clang APINotes. In order for dsymutil to collect .apinotes files (which capture attributes such as nullability, Swift import names, and availability), I want to propose adding an apinotes: field to DIModule that gets translated into a DW_AT_LLVM_apinotes (path) nested inside DW_TAG_module. This will be primarily used by LLDB to indirectly extract the Swift names of Clang declarations that were deserialized from DWARF. <rdar://problem/59514626> Differential Revision: https://reviews.llvm.org/D75585	2020-03-11 18:47:30 -07:00
Adrian Prantl	e4e7e44765	Add an SDK attribute to DICompileUnit This is part of PR44213 https://bugs.llvm.org/show_bug.cgi?id=44213 When importing (system) Clang modules, LLDB needs to know which SDK (e.g., MacOSX, iPhoneSimulator, ...) they came from. While the sysroot attribute contains the absolute path to the SDK, this doesn't work well when the debugger is run on a different machine than the compiler, and the SDKs are installed in different directories. It thus makes sense to just store the name of the SDK instead of the absolute path, so it can be found relative to LLDB. rdar://problem/51645582 Differential Revision: https://reviews.llvm.org/D75646	2020-03-11 14:14:06 -07:00
Jin Lin	a0cacb6054	Fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode Summary: The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode. The purpose is to provide the support of LTO for swift and Objective-C mixed project. Reviewers: rjmccall, ahatanak, steven_wu Reviewed By: rjmccall, steven_wu Subscribers: manmanren, mehdi_amini, hiraditya, dexonsmith, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71219	2020-03-11 13:26:06 -07:00
Philip Reames	8f997b4f01	[GC] Loosen ordering on statepoint reloads to allow CSE We just removed a broken duplicate elimination algorithm in D75964, and after landing that it occurred to me that duplicate elimination is simply CSE. SelectionDAG has a built-in CSE, so why wasn't that triggering? Well, it turns out we were overly conservative in the memory states for our reloads and CSE (rightly) considers the incoming memory state for a load part of the identity of the load. By loosening the chain and allowing reordering, we also allow CSE. As shown in the test case, doing iterative CSE as we go is enough to eliminate duplicate stores in later statepoints as well. We key our (block local) slot map by SDValue, so commoning a previous pair of loads at construction time means we also common following stores. Differential Revision: https://reviews.llvm.org/D76013	2020-03-11 12:30:06 -07:00
Simon Pilgrim	d8f9416fdc	[DAG] MatchRotate - Add funnel shift by immediate support This patch reuses the existing MatchRotate ROTL/ROTR rotation pattern code to also recognize the more general FSHL/FSHR funnel shift patterns when we have constant shift amounts. Differential Revision: https://reviews.llvm.org/D75114	2020-03-11 18:55:18 +00:00
Juneyoung Lee	8eb2f865c3	[CodeGenPrepare] Fold br(freeze(icmp x, const)) to br(icmp(freeze x, const)) Summary: This patch helps CodeGenPrepare move freeze into the icmp when it is used by branch. It reenables generation of efficient conditional jumps. This is only done when at least one of icmp's operands is constant to prevent the transformation from increasing # of freeze instructions. Performance degradation of MultiSource/Benchmarks/Ptrdist/yacr2/yacr2.test is resolved with this patch. Checked with Alive2 Reviewers: reames, fhahn, nlopes Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75859	2020-03-12 03:16:15 +09:00
Philip Reames	e671641844	[GC] Remove buggy untested optimization from statepoint lowering A downstream test case (see included reduced test) revealed that we have a bug in how we handle duplicate relocations. If we have the same SDValue relocated twice, and that value happens to be a constant (such as null), we only export one of the two llvm::Values. Exporting on a per llvm::Value basis is required to allow lowering of gc.relocates in following basic blocks (e.g. invokes). Without it, we end up with a use of an undefined vreg and bad things happen. Rather than fixing the optimization - which appears to be hard - I propose we simply remove it. There are no tests in tree that change with this code removed. If we find out later that this did matter for something, we can reimplement a variation of this in CodeGenPrepare to catch the easy cases without complicating the lowering code. Thanks to Denis and Serguei who did all the hard work of figuring out what went wrong here. The patch is by far the easy part. :) Differential Revision: https://reviews.llvm.org/D75964	2020-03-11 10:03:24 -07:00
Matt Arsenault	c0ad75e758	GlobalISel: Don't try to narrow extending loads/trunc store If the loaded memory size was smaller than the result size, this would produce out of bounds memory accesses. I'm wondering if we need a distinct narrow memory legalize action type, since a case I care about is decomposing a 4-byte unaligned access into 4 extending loads, which would leave the original result register type. I'm currently awkwardly using narrowScalar to handle unaligned accesses that need to be split.	2020-03-10 23:34:10 -04:00
Matt Arsenault	b17a81f8b2	GlobalISel: Add missing add/sub with carries to MachineIRBuilder	2020-03-10 22:39:55 -04:00
Matt Arsenault	ce8a1f7294	GlobalISel: Implement fewerElementsVector for G_TRUNC Extend fewerElementsVectorBasic to handle operands with different element types.	2020-03-10 15:17:20 -07:00
Benjamin Kramer	247a177cf7	Give helpers internal linkage. NFC.	2020-03-10 18:27:42 +01:00
Kazushi (Jam) Marukawa	3dabad1af3	[VE] Target-specific bit size for sjljehprepare Summary: This patch extends the TargetMachine to let targets specify the integer size used by the sjljehprepare pass. This is 64bit for the VE target and otherwise defaults to 32bit for all targets, which was hard-wired before. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D71337	2020-03-10 17:51:16 +01:00
Simon Pilgrim	e71fb46a8f	[TargetLowering] SimplifyDemandedVectorElts - add DemandedElts mask to ISD::BITCAST SimplifyDemandedBits call. This fixes most of the regressions introduced in the rG4bc6f6332028 bugfix. The vector-trunc.ll issue should be fixed by D66004.	2020-03-10 13:39:10 +00:00
Djordje Todorovic	5aa5c943f7	Reland "[DebugInfo] Enable the debug entry values feature by default" Differential Revision: https://reviews.llvm.org/D73534	2020-03-10 09:15:06 +01:00
Puyan Lotfi	4b8af31f63	[llvm][MIRVRegNamer] Avoid collisions across constant pool indices. When hashing on MachineOperand::MO_ConstantPoolIndex, now MIR-Canon and MIRVRegNamer will no longer result in a hash collision. Differential Revision: https://reviews.llvm.org/D74449	2020-03-10 01:13:20 -04:00
Marcello Maggioni	e5205074df	Move Spiller.h from lib/ directory path to include/CodeGen. NFC This allows Spiller.h to be used and included outside of the lib/CodeGen directory. For example to be used in the lib/Target directory or other places.	2020-03-09 10:52:28 -07:00
Djordje Todorovic	c15c68abdc	[CallSiteInfo] Enable the call site info only for -g + optimizations Emit call site info only in the case of '-g' + 'O>0' level. Differential Revision: https://reviews.llvm.org/D75175	2020-03-09 12:12:44 +01:00
Clement Courbet	6518b72f93	[ExpandMemCmp] Properly constant-fold all compares. Summary: This gets rid of duplicated code and diverging behaviour w.r.t. constants. Fixes PR45086. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75519	2020-03-09 10:40:52 +01:00
Clement Courbet	f7e6f5f8e3	[ExpandMemCmp] Properly constant-fold all compares. Summary: This gets rid of duplicated code and diverging behaviour w.r.t. constants. Fixes PR45086. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75519	2020-03-09 09:10:34 +01:00
Matt Arsenault	a4e71f01c0	Assume ieee behavior without denormal-fp-math attribute	2020-03-07 12:10:56 -05:00
Amara Emerson	c1a97e992d	Revert "Revert "[GlobalISel][Localizer] Enable intra-block localization of already-local uses."" This reverts commit `5583c2f2fb`. The lldb bot failure was a test that was fragile and sensitive to irrelevant changes in instruction ordering. Re-committing this as the test should have been skipped for AArch64 now. Differential Revision: https://reviews.llvm.org/D75555	2020-03-06 21:35:08 -08:00
Jin Lin	fc6fda90f7	Fix incorrect logic in maintaining the side-effect of compiler generated outliner functions Summary: Fix incorrect logic in maintaining the side-effect of compiler generated outliner functions by adding the up-exposed uses. Reviewers: paquette, tellenbach Reviewed By: paquette Subscribers: aemerson, lebedev.ri, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71217	2020-03-06 09:13:20 -08:00
Xiangling Liao	362456bc53	[AIX] Handle LinkOnceODRLinkage and AppendingLinkage for static init global arrays Handle LinkOnceODRLinkage; Handle AppendingLinkage type for llvm.global_ctors/dtors static init global arrays; Differential Revision: https://reviews.llvm.org/D75305	2020-03-06 09:26:55 -05:00
Simon Pilgrim	7202d9cde9	[DAG] Combine fshl/fshr(load1,load0,c) if we have consecutive loads As noted on D75114, if both arguments of a funnel shift are consecutive loads we are missing the opportunity to combine them into a single load. Differential Revision: https://reviews.llvm.org/D75624	2020-03-06 11:36:18 +00:00
Dominik Montada	feb20a1594	[GlobalISel] add missing libcalls and 128-bit support for floating points Add libcall support for G_FMINNUM, G_FMAXNUM, G_FSQRT, G_FRINT, G_FNEARBYINT. Add 128-bit libcall support for all simple libcalls. Reviewers: arsenm, Petar.Avramovic, dsanders, petarj, paquette Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D75516	2020-03-06 09:06:13 +01:00
Hiroshi Yamauchi	76b9901fb1	[PGO][PGSO] Use IsColdXNthPercentile for sample PGO. Summary: This performs better for sample PGO. NFC as PGSOColdCodeOnlyForSamplePGO is still true. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75550	2020-03-05 09:54:54 -08:00
QingShan Zhang	3906ae387f	[DAGCombine] Check the uses of negated floating constant and remove the hack PowerPC hits an assertion for much the same reason as https://reviews.llvm.org/D70975. Though there is already a hack, it still fails in some cases: when operand 0 is not a const fp but another fma with a const fp, and that const fp is negated, which results in multiple uses. A better fix is to check the uses of the negated const fp. If its negated value is already used, we benefit because no extra node is added. Differential revision: https://reviews.llvm.org/D75501	2020-03-05 03:42:50 +00:00
Muhammad Omair Javaid	5583c2f2fb	Revert "[GlobalISel][Localizer] Enable intra-block localization of already-local uses." This reverts commit `e91e1df6ab`.	2020-03-05 03:12:28 +05:00
Matt Arsenault	b71203a751	GlobalISel: Move some legalizer functions to utils	2020-03-04 16:40:00 -05:00
Matt Arsenault	fb0c35fa34	GlobalISel: Set alignment on function argument stack load/store	2020-03-04 16:38:46 -05:00
Wei Mi	3c96d01d2e	Generate Callee Saved Register (CSR) related cfi directives like .cfi_restore. https://reviews.llvm.org/D42848 only handled CFA related cfi directives but didn't handle CSR related cfi. The patch adds the CSR part. Basically it reuses the framework created in D42848. For each basicblock, the patch tracks which CSR set has been saved at its CFG predecessors' exits, and compares the CSR set with the set at its previous basicblock's exit (The previous block is the block laid before the current block). If the saved CSR set at its previous basicblock's exit is larger, .cfi_restore will be inserted. The patch also generates proper .cfi_restore in epilogue to make sure the saved CSR set is consistent for the incoming edges of each block. Differential Revision: https://reviews.llvm.org/D74303	2020-03-04 11:18:37 -08:00
Guozhi Wei	ee9a3eba76	[CodeGenPrepare] Handle ExtractValueInst in dupRetToEnableTailCallOpts As the test case shows if there is an ExtractValueInst in the Ret block, function dupRetToEnableTailCallOpts can't duplicate it into the block containing call. So later no tail call is generated in CodeGen. This patch adds the ExtractValueInst handling code in function dupRetToEnableTailCallOpts and FoldReturnIntoUncondBranch, and later tail call can be generated for this case. Differential Revision: https://reviews.llvm.org/D74242	2020-03-04 11:10:32 -08:00
Nikita Popov	0e890cd4d4	[ConstantFolding] Always return something from ConstantFoldConstant Spin-off from D75407. As described there, ConstantFoldConstant() currently returns null for non-ConstantExpr/ConstantVector inputs, but otherwise always returns non-null, independently of whether any folding has happened or not. This is confusing and makes consumer code more complicated. I would expect either that ConstantFoldConstant() returns non-null only if it actually folded something, or that it always returns non-null. I'm going with the latter possibility here, which appears to be more useful considering existing usage. Differential Revision: https://reviews.llvm.org/D75543	2020-03-04 18:24:47 +01:00
Sanjay Patel	29a2b20ab3	[SDAG] simplify FP binops to undef As discussed in the commit thread for rGa253a2a and D73978, we can do more undef folding for FP ops. The nnan and ninf fast-math-flags specify that if an operand is the disallowed value, the result is poison, so we can produce an undef result. But this doesn't work as expected (the undef operand cases remain) because of a Flags propagation problem in SelectionDAGBuilder. I've added DAGCombiner calls to enable these for the other cases because we've shown in other patches that (because of the limited way that SDAG iterates), it is possible to miss simplifications like this if they are done only at node creation time. Several potential follow-ups to expand on this patch are possible. Differential Revision: https://reviews.llvm.org/D75576	2020-03-04 10:42:16 -05:00
Amara Emerson	e91e1df6ab	[GlobalISel][Localizer] Enable intra-block localization of already-local uses. This changes the localizer to attempt intra-block localization of instructions that have local uses. This is useful because sometimes the entry block itself has many uses of constant-like instructions, which would benefit from shortening live ranges. Previously if an inst had no non-local uses, we wouldn't add it to the list of instructions to attempt further intra-block localization. This gives a 0.7% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D75555	2020-03-03 18:14:57 -08:00
Fangrui Song	90acc505ed	[MCDwarf] Change emitListsTableHeaderStart to use a reference and fold Start/End symbols generation into it Apply @dblaikie's suggestions in a post-commit review for D75375 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D75568	2020-03-03 16:20:40 -08:00
Amy Huang	5b3b21f025	[DebugInfo] Fix for adding "returns cxx udt" option to functions in CodeView. Summary: This change checks for the return type in the frontend and adds a flag to the DISubroutineType to indicate that the option should be added in CodeViewDebug. Previously function types sometimes appeared twice in the PDB: once with "returns cxx udt" and once without. See https://bugs.llvm.org/show_bug.cgi?id=44785. Reviewers: rnk, asmith Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75215	2020-03-03 14:00:08 -08:00
Vedant Kumar	f002ee55c7	[MachineVerifier] Remove placement rule exception for debug entry values There should not be an exception allowing debug entry values to be placed after a terminator. Differential Revision: https://reviews.llvm.org/D75559	2020-03-03 13:02:18 -08:00
Vedant Kumar	2bf496620c	[LiveDebugValues] Do not insert DBG_VALUEs after a MBB terminator This fixes a miscompile that happened because a DBG_VALUE interfered with the MachineOutliner's liveness analysis. Inserting a DBG_VALUE after a terminator breaks predicates on MBB such as isReturnBlock(). And the resulting DBG_VALUE cannot be "live". I plan to introduce a MachineVerifier check for this situation in a follow up. rdar://59859175 Testing: check-llvm, LNT build with a stage2 compiler & entry values enabled Differential Revision: https://reviews.llvm.org/D75548	2020-03-03 13:00:52 -08:00
Fangrui Song	55a56041d1	[MCDwarf] Generate DWARF v5 .debug_rnglists for assembly files ``` // clang -c -gdwarf-5 a.s -o a.o .section .init; ret .text; ret ``` .debug_info contains DW_AT_ranges and llvm-dwarfdump will report a verification error because .debug_rnglists does not exist (not implemented). This patch generates .debug_rnglists for assembly files. emitListsTableHeaderStart() in DwarfDebug.cpp can be shared with MCDwarf.cpp. Because CodeGen depends on MC, I move the function to MCDwarf.cpp Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D75375	2020-03-03 09:03:34 -08:00
Craig Topper	d8ad7cc088	[DAGCombiner][X86] Improve narrowExtractedVectorLoad to handle cases where the element size isn't byte sized but the subvector is. Summary: Follow up from D75377. If the subvector is byte sized and the index is aligned to the subvector size, we can shrink the load. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: dbabokin, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75434	2020-03-03 08:41:31 -08:00
Sam Parker	5618e9be37	[RDA][ARM] collectKilledOperands across multiple blocks Use MIOperand in collectLocalKilledOperands to make the search global, as we already have to search for global uses too. This allows us to delete more dead code when tail predicating. Differential Revision: https://reviews.llvm.org/D75167	2020-03-03 15:23:05 +00:00
Sam Parker	dfe8f5da4c	[ARM][RDA] Allow multiple killed users In RDA, check against the already decided dead instructions when looking at users. This allows an instruction to be removed if it has multiple users, but they're all dead. This means that IT instructions can be considered killed once all the itstate using instructions are dead. Differential Revision: https://reviews.llvm.org/D75245	2020-03-03 15:12:29 +00:00
Clement Courbet	b0ae20d92e	[ExpandMemCmp][NFC] Fix typo in comment.	2020-03-03 11:07:13 +01:00
Awanish Pandey	1cb0e01e42	[DebugInfo][DWARF5]: Added support for debuginfo generation for defaulted parameters This patch adds support for dwarf emission/dumping part of debuginfo generation for defaulted parameters. Reviewers: probinson, aprantl, dblaikie Reviewed By: aprantl, dblaikie Differential Revision: https://reviews.llvm.org/D73462	2020-03-03 13:09:53 +05:30
Vedant Kumar	d64a22a2ad	[LiveDebugValues] Prevent some misuse of LocIndex::fromRawInteger, NFC Make it a compile-time error to pass an int/unsigned/etc to fromRawInteger. Hopefully this prevents errors of the form: ``` for (unsigned ID : getVarLocs()) { auto VL = LocMap[LocIndex::fromRawInteger(ID)]; ... ```	2020-03-02 16:59:09 -08:00
Jordan Rupprecht	d7803c3832	Add default case to fix -Wswitch errors	2020-03-02 14:23:46 -08:00
Craig Topper	adc69729ec	[TargetLowering] Fix what look like copy/paste mistakes in the compare-with-infinity handling in SimplifySetCC. I expect that the isCondCodeLegal checks should match the CC of the node that we're going to create. Rewriting to a switch to minimize repeated mentions of the same constants.	2020-03-02 14:12:16 -08:00
Stanislav Mekhanoshin	1bacdcf48d	Extend LaneBitmask to 64 bit This is needed for D74873: AMDGPU is going to have 16 bit subregs and the largest tuple is 32 VGPRs, which results in 64 lanes. Differential Revision: https://reviews.llvm.org/D75378	2020-03-02 12:10:52 -08:00
Volkan Keles	4167645d1e	GlobalISel: Move Localizer::shouldLocalize(..) to TargetLowering Add a new target hook for shouldLocalize so that targets can customize the logic. https://reviews.llvm.org/D75207	2020-03-02 09:15:40 -08:00
Simon Pilgrim	d20fb7ea13	Fix shadow variable warning. NFC.	2020-03-02 11:41:20 +00:00
Simon Pilgrim	e4380b07cc	Fix operator precedence warning. NFCI.	2020-03-02 10:56:58 +00:00
Serguei Katkov	496e0a99c7	[InlineSpiller] Relax re-materialization restriction for statepoint We should be careful to keep the count of re-materialized operands below the number of physical registers. A STATEPOINT instruction has a variable, and potentially very large, number of operands. So re-materialization for all operands is disabled at the moment if restrict-statepoint-remat is true. The patch relaxes the re-materialization restriction for the STATEPOINT instruction, allowing it for fixed operands, specifically the call target. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits, qcolombet, hiraditya Differential Revision: https://reviews.llvm.org/D75335	2020-03-02 11:25:44 +07:00
Craig Topper	0cd6712a7a	[DAGCombiner][X86] Disable narrowExtractedVectorLoad if the element type size isn't byte sized The address calculation for the offset assumes that you can calculate the offset by multiplying the index by the store size of the element. But that only works if the element's store size is exactly its real size since we store vectors tightly packed in memory. There are improvements we could make to this like special casing extracting element 0. I think we could also handle cases where the extracted VT is byte sized and the index is aligned with the extract element count. Differential Revision: https://reviews.llvm.org/D75377	2020-03-01 18:13:25 -08:00
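The offset problem described in the entry above can be illustrated with a small arithmetic sketch. This is not the DAGCombiner code; the helper names are hypothetical and the only assumption is the one the message states, that vector elements are stored tightly packed.

```python
def store_size_bytes(elt_bits):
    # Bytes needed to store one element on its own (rounded up).
    return (elt_bits + 7) // 8


def naive_byte_offset(index, elt_bits):
    # Offset computed as index * element store size.
    # Only correct when the store size equals the real element size.
    return index * store_size_bytes(elt_bits)


def packed_byte_offset(index, elt_bits):
    # Offset in a tightly packed vector, or None when the element
    # does not start on a byte boundary.
    bit_offset = index * elt_bits
    if bit_offset % 8 != 0:
        return None
    return bit_offset // 8


# i32 elements: both computations agree.
i32_naive = naive_byte_offset(4, 32)
i32_packed = packed_byte_offset(4, 32)

# i1 elements: element 8 starts at byte 1 in memory, but the naive
# computation multiplies by a 1-byte store size and lands at byte 8.
i1_naive = naive_byte_offset(8, 1)
i1_packed = packed_byte_offset(8, 1)
```

For i1 elements the two computations disagree, which is why the narrowing is disabled unless the element size is a whole number of bytes.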
Craig Topper	b6e2796114	[X86][TwoAddressInstructionPass] Teach tryInstructionCommute to continue checking for commutable FMA operands in more cases. Previously we would only check for another commutable operand if the first commute was an aggressive commute. But if we have two kill operands and neither is tied to the def at the start, we should consider both operands as the one to use as the new def. This improves the loop in the fma-commute-loop.ll test. This test is derived from a post from discourse here https://llvm.discourse.group/t/unnecessary-vmovapd-instructions-generated-can-you-hint-in-favor-of-vfmadd231pd/582 Differential Revision: https://reviews.llvm.org/D75016	2020-03-01 16:38:08 -08:00
Craig Topper	211fb91f10	[DAGCombiner] Don't emit select_cc from visitSINT_TO_FP/visitUINT_TO_FP. Use plain select instead. Select_cc isn't used by all targets. X86 doesn't have optimizations for it. Since we already know the input to the sint_to_fp/uint_to_fp is a setcc we can just emit a plain select using that setcc as the condition. Other DAG combines can turn that into a select_cc on targets that support it. Differential Revision: https://reviews.llvm.org/D75415	2020-03-01 10:52:17 -08:00
Sanjay Patel	619d7dc39a	[DAGCombiner] recognize shuffle (shuffle X, Mask0), Mask --> splat X We get the simple cases of this via demanded elements and other folds, but that doesn't work if the values have >1 use, so add a dedicated match for the pattern. We already have this transform in IR, but it doesn't help the motivating x86 tests (based on PR42024) because the shuffles don't exist until after legalization and other combines have happened. The AArch64 test shows a minimal IR example of the problem. Differential Revision: https://reviews.llvm.org/D75348	2020-03-01 09:10:25 -05:00
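A minimal sketch of the identity behind the pattern above, assuming single-input shuffles whose masks are lists of lane indices with -1 standing for undef. The function names are hypothetical and this models only the mask arithmetic, not the DAGCombiner match itself.

```python
UNDEF = -1  # assumed marker for an undefined mask lane


def compose_masks(outer, inner):
    # Lane i of shuffle(shuffle(X, inner), outer) reads X[inner[outer[i]]].
    return [UNDEF if lane == UNDEF else inner[lane] for lane in outer]


def splat_source_lane(outer, inner):
    # Return the single lane of X that every defined result lane reads,
    # or None when the nested shuffles are not a splat of one lane.
    lanes = {lane for lane in compose_masks(outer, inner) if lane != UNDEF}
    return lanes.pop() if len(lanes) == 1 else None


# Inner shuffle leaves lanes 1 and 3 undefined; outer only reads lanes 0 and 2.
inner_mask = [2, UNDEF, 2, UNDEF]
outer_mask = [0, 2, 0, 2]
splat_lane = splat_source_lane(outer_mask, inner_mask)
```

Here every defined lane of the result reads lane 2 of X, so the pair of shuffles collapses to a splat of that lane.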
Simon Pilgrim	d955b221cb	[MachineInst] Remove dead code. NFCI. The MachineFunction MF value is not used any more and is always null.	2020-02-29 19:25:02 +00:00
Simon Pilgrim	6e7a768354	Make argument const to silence cppcheck warning. NFCI.	2020-02-29 19:25:01 +00:00
Fangrui Song	692e0c9648	[MC] Add MCStreamer::emitInt{8,16,32,64} Similar to AsmPrinter::emitInt{8,16,32,64}.	2020-02-29 09:40:21 -08:00
Vedant Kumar	dd1ea9de2e	Reland: [Coverage] Revise format to reduce binary size Try again with an up-to-date version of D69471 (`99317124` was a stale revision). --- Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2020-02-28 18:12:04 -08:00
Vedant Kumar	3388871714	Revert "[Coverage] Revise format to reduce binary size" This reverts commit `99317124e1`. This is still busted on Windows: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/40873 The llvm-cov tests report 'error: Could not load coverage information'.	2020-02-28 18:03:15 -08:00
Vedant Kumar	99317124e1	[Coverage] Revise format to reduce binary size Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2020-02-28 17:33:25 -08:00
Vedant Kumar	0368b42295	[entry values] ARM: Add a describeLoadedValue override (PR45025) As a narrow stopgap for the assertion failure described in PR45025, add a describeLoadedValue override to ARMBaseInstrInfo and use it to detect copies in which the forwarding reg is a super/sub reg of the copy destination. For the moment this is unsupported. Several follow ups are possible: 1) Handle VORRq. At the moment, we do not, because isCopyInstrImpl returns early when !MI.isMoveReg(). 2) In the case where forwarding reg is a super-reg of the copy destination, we should be able to describe the forwarding reg as a subreg within the copy destination. I'm not 100% sure about this, but it looks like that's what's done in AArch64InstrInfo. 3) In the case where the forwarding reg is a sub-reg of the copy destination, maybe we could describe the forwarding reg using the copy destination and a DW_OP_LLVM_fragment (I guess this should be possible after D75036). https://bugs.llvm.org/show_bug.cgi?id=45025 rdar://59772698 Differential Revision: https://reviews.llvm.org/D75273	2020-02-28 14:30:40 -08:00
David Green	1de1070559	[DAGCombine] Fix alias analysis for unaligned accesses The alias analysis in DAG Combine looks at the BaseAlign, the Offset and the Size of two accesses, and determines if they are known to access different parts of memory by the fact that they are different offsets from inside that "alignment window". It does not seem to account for accesses whose offsets are not a multiple of the size, and which may overflow from one alignment window into another. For example, in the test case we have a 19 byte memset that is split into a 16 byte neon store and an unaligned 4 byte store with a 15 byte offset. This 15 byte offset (with a base align of 8) wraps around to the next alignment window. When compared to an access that is at a 16 byte offset (of the same 4 byte size and 8 byte base align), the two accesses are said not to alias. I've fixed this here by just ensuring that the offsets are a multiple of the size, ensuring that they don't overlap by wrapping. Fixes PR45035, which was exposed by the UseAA changes in the arm backend. Differential Revision: https://reviews.llvm.org/D75238	2020-02-28 18:44:36 +00:00
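The alignment-window reasoning above can be sketched as follows. This is a simplified model, not the DAGCombiner source: the function name and the `require_size_multiple` switch are hypothetical, and both accesses are assumed to share one size and one base alignment, as in the example from the message.

```python
def known_no_alias(offset0, offset1, size, base_align, require_size_multiple=True):
    # Return True only when the two accesses provably touch different bytes.
    # The check compares positions inside the alignment window, so it is
    # only meaningful when the window is larger than one access.
    if offset0 == offset1 or base_align <= size:
        return False
    # The fix: refuse to reason about offsets that are not a multiple of the
    # access size, because such an access can spill into the next window.
    if require_size_multiple and (offset0 % size != 0 or offset1 % size != 0):
        return False
    window0 = offset0 % base_align
    window1 = offset1 % base_align
    return window0 + size <= window1 or window1 + size <= window0


# Example from the message: 4-byte accesses at offsets 15 and 16, base align 8.
# Bytes 15..18 and 16..19 overlap, so "no alias" would be wrong.
without_fix = known_no_alias(15, 16, 4, 8, require_size_multiple=False)
with_fix = known_no_alias(15, 16, 4, 8)
```

Without the multiple-of-size guard the window arithmetic wrongly reports the two stores as disjoint; with it, the query falls back to the conservative answer.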
Simon Pilgrim	4bc6f63320	[TargetLowering] SimplifyDemandedBits - fix SCALAR_TO_VECTOR knownbits bug We can only report the knownbits for a SCALAR_TO_VECTOR node if we only demand the 0'th element - the upper elements are undefined and shouldn't be trusted. This is causing a number of regressions that need addressing but we need to get the bugfix in first.	2020-02-28 15:23:37 +00:00
Jeremy Morse	6af859dcca	[DebugInfo] Re-implement LexicalScopes dominance method, add unit tests Way back in D24994, the combination of LexicalScopes::dominates and LiveDebugValues was identified as having worst-case quadratic complexity, but it wasn't triggered by any code path at the time. I've since run into a scenario where this occurs, in a very large basic block where large numbers of inlined DBG_VALUEs are present. The quadratic-ness comes from LiveDebugValues::join calling "dominates" on every variable location, and LexicalScopes::dominates potentially touching every instruction in a block to test for the presence of a scope. We have, however, already computed the presence of scopes in blocks, in the "InsnRanges" of each scope. This patch switches the dominates method to examine whether a block is present in a scope's InsnRanges, avoiding walking through the whole block. At the same time, fix getMachineBasicBlocks to account for the fact that InsnRanges can cover multiple blocks, and add some unit tests, as Lexical Scopes didn't have any. Differential revision: https://reviews.llvm.org/D73725	2020-02-28 11:41:28 +00:00
Sam Parker	bf61421a02	[RDA] Track implicit-defs Ensure that we're recording implicit defs, as well as visiting implicit uses and implicit defs when we're walking through operands. Differential Revision: https://reviews.llvm.org/D75185	2020-02-28 11:14:42 +00:00
serge-sans-paille	6d15c4deab	No longer generate calls to *_finite According to Joseph Myers, a libm maintainer > They were only ever an ABI (selected by use of -ffinite-math-only or > options implying it, which resulted in the headers using "asm" to redirect > calls to some libm functions), not an API. The change means that ABI has > turned into compat symbols (only available for existing binaries, not for > anything newly linked, not included in static libm at all, not included in > shared libm for future glibc ports such as RV32), so, yes, in any case > where tools generate direct calls to those functions (rather than just > following the "asm" annotations on function declarations in the headers), > they need to stop doing so. As a consequence, we should no longer assume these symbols are available on the target system. Still keep the TargetLibraryInfo for constant folding. Differential Revision: https://reviews.llvm.org/D74712	2020-02-28 10:07:37 +01:00
Vedant Kumar	a993720397	[LiveDebugValues] Encode register location within VarLoc IDs [3/3] This is part 3 of a 3-part series to address a compile-time explosion issue in LiveDebugValues. --- Start encoding register locations within VarLoc IDs, and take advantage of this encoding to speed up transferRegisterDef. There is no fundamental algorithmic change: this patch simply swaps out SparseBitVector in favor of CoalescingBitVector. That changes iteration order (hence the test updates), but otherwise this patch is NFCI. The only interesting change is in transferRegisterDef. Instead of doing: ``` KillSet = {} for (ID : OpenRanges.getVarLocs()) if (DeadRegs.count(ID)) KillSet.add(ID) ``` We now do: ``` KillSet = {} for (Reg : DeadRegs) for (ID : intervalsReservedForReg(Reg, OpenRanges.getVarLocs())) KillSet.add(ID) ``` By not visiting each open location every time we visit an instruction, this eliminates some potentially quadratic behavior. The new implementation basically does a constant amount of work per instruction because the interval map lookups are very fast. For a file in WebKit, this brings the time spent in LiveDebugValues down from ~2.5 minutes to 4 seconds, reducing compile time spent in that pass from 28% of the total to just over 1%. 
Before: ``` 2.49 min 27.8% 0 s LiveDebugValues::process 2.41 min 27.0% 5.40 s LiveDebugValues::transferRegisterDef 1.51 min 16.9% 1.51 min LiveDebugValues::VarLoc::isDescribedByReg() const 32.73 s 6.1% 8.70 s llvm::SparseBitVector<128u>::SparseBitVectorIterator::operator++() ``` After: ``` 4.53 s 1.1% 0 s LiveDebugValues::process 3.00 s 0.7% 107.00 ms LiveDebugValues::transferRegisterCopy 892.00 ms 0.2% 406.00 ms LiveDebugValues::transferSpillOrRestoreInst 404.00 ms 0.1% 32.00 ms LiveDebugValues::transferRegisterDef 110.00 ms 0.0% 2.00 ms LiveDebugValues::getUsedRegs 57.00 ms 0.0% 1.00 ms std::__1::vector<>::push_back 40.00 ms 0.0% 1.00 ms llvm::CoalescingBitVector<>::find(unsigned long long) ``` FWIW, I tried the same approach using SparseBitVector, but got bad results. To do that, I had to extend SparseBitVector to support 64-bit indices and expose its lower bound operation. The problem with this is that the performance is very hard to predict: SparseBitVector's lower bound operation falls back to O(n) linear scans in a std::list if you're not /very/ careful about managing iteration order. When I profiled this the performance looked worse than the baseline. You can see the full CoalescingBitVector-based implementation here: https://github.com/vedantk/llvm-project/commits/try-coalescing You can see the full SparseBitVector-based implementation here: https://github.com/vedantk/llvm-project/commits/try-sparsebitvec-find Depends on D74984 and D74985. Differential Revision: https://reviews.llvm.org/D74986	2020-02-27 12:39:47 -08:00
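The change to transferRegisterDef described above can be sketched in a few lines. The bit layout (register in the upper 32 bits of the ID, index in the lower 32) and all function names are assumptions made for illustration, and a sorted Python list with `bisect` stands in for CoalescingBitVector.

```python
from bisect import bisect_left

INDEX_BITS = 32  # assumed layout: register in the high bits, index in the low bits


def encode(reg, index):
    # Pack a register number and a VarLoc index into one integer ID.
    return (reg << INDEX_BITS) | index


def kill_set_linear(open_ids, dead_regs):
    # Old shape: visit every open location and test its register.
    return [i for i in open_ids if (i >> INDEX_BITS) in dead_regs]


def kill_set_by_interval(sorted_open_ids, dead_regs):
    # New shape: for each dead register, jump straight to the ID interval
    # reserved for it, [encode(reg, 0), encode(reg + 1, 0)).
    killed = []
    for reg in sorted(dead_regs):
        lo = bisect_left(sorted_open_ids, encode(reg, 0))
        hi = bisect_left(sorted_open_ids, encode(reg + 1, 0))
        killed.extend(sorted_open_ids[lo:hi])
    return killed


open_ids = sorted([encode(1, 0), encode(1, 5), encode(2, 3), encode(7, 1)])
dead = {1, 7}
killed = kill_set_by_interval(open_ids, dead)
```

Both functions produce the same kill set; the interval version does work proportional to the number of dead registers and killed locations rather than to the number of open locations.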
Vedant Kumar	210c4853de	[LiveDebugValues] Encode a location in VarLoc IDs, NFC [2/3] This is part 2 of a 3-part series to address a compile-time explosion issue in LiveDebugValues. --- Each VarLoc has a unique ID: this ID is used to look up a VarLoc in the VarLocMap, and to virtually insert a VarLoc into a VarLocSet. Instead of inserting the VarLoc /itself/ into the VarLocSet, we insert just the ID, because this can be represented efficiently with a SparseBitVector. This change introduces LocIndex, a layer of abstraction on top of VarLoc IDs. Prior to this change, an ID was just an index into a vector. With this change, an ID encodes both an index /and/ a register location. The type-checker ensures that conversions to and from LocIndex are correct. For the moment the register location is always 0 (undef). We have plenty of bits left over to encode physregs, stack slots, and other locations in the future. Differential Revision: https://reviews.llvm.org/D74985	2020-02-27 12:39:47 -08:00
Sanjay Patel	90fd859f51	[x86] use instruction-level fast-math-flags to drive MachineCombiner The code changes here are hopefully straightforward: 1. Use MachineInstruction flags to decide if FP ops can be reassociated (use both "reassoc" and "nsz" to be consistent with IR transforms; we probably don't need "nsz", but that's a safer interpretation of the FMF). 2. Check that both nodes allow reassociation to change instructions. This is a stronger requirement than we've usually implemented in IR/DAG, but this is needed to solve the motivating bug (see below), and it seems unlikely to impede optimization at this late stage. 3. Intersect/propagate MachineIR flags to enable further reassociation in MachineCombiner. We managed to make MachineCombiner flexible enough that no changes are needed to that pass itself. So this patch should only affect x86 (assuming no other targets have implemented the hooks using MachineIR flags yet). The motivating example in PR43609 is another case of fast-math transforms interacting badly with special FP ops created during lowering: https://bugs.llvm.org/show_bug.cgi?id=43609 The special fadd ops used for converting int to FP assume that they will not be altered, so those are created without FMF. However, the MachineCombiner pass was being enabled for FP ops using the global/function-level TargetOption for "UnsafeFPMath". We managed to run instruction/node-level FMF all the way down to MachineIR sometime in the last 1-2 years though, so we can do better now. The test diffs require some explanation: 1. llvm/test/CodeGen/X86/fmf-flags.ll - no target option for unsafe math was specified here, so MachineCombiner kicks in where it did not previously; to make it behave consistently, we need to specify a CPU schedule model, so use the default model, and there are no code diffs. 2. 
llvm/test/CodeGen/X86/machine-combiner.ll - replace the target option for unsafe math with the equivalent IR-level flags, and there are no code diffs; we can't remove the NaN/nsz options because those are still used to drive x86 fmin/fmax codegen (special SDAG opcodes). 3. llvm/test/CodeGen/X86/pow.ll - similar to #1 4. llvm/test/CodeGen/X86/sqrt-fastmath.ll - similar to #1, but MachineCombiner does some reassociation of the estimate sequence ops; presumably these are perf wins based on latency/throughput (and we get some reduction of move instructions too); I'm not sure how it affects numerical accuracy, but the test reflects reality better now because we would expect MachineCombiner to be enabled if the IR was generated via something like "-ffast-math" with clang. 5. llvm/test/CodeGen/X86/vec_int_to_fp.ll - this is the test added to model PR43609; the fadds are not reassociated now, so we should get the expected results. 6. llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll - similar to #1 7. llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll - similar to #1 Differential Revision: https://reviews.llvm.org/D74851	2020-02-27 15:19:37 -05:00
Djordje Todorovic	016d91ccbd	[CallSiteInfo] Handle bundles when updating call site info This will address the issues P8198 and P8199 (from D73534). The methods did not handle bundles properly. Differential Revision: https://reviews.llvm.org/D74904	2020-02-27 13:57:06 +01:00
David Stenberg	6d857166d2	[DebugInfo] Describe call site values for chains of expression producing instrs Summary: If the describeLoadedValue() hook produced a DIExpression when describing an instruction, and it was not possible to emit a call site entry directly (the value operand was not an immediate nor a preserved register), then that described value could not be inserted into the worklist, and would instead be dropped, meaning that the parameter's call site value couldn't be described. This patch extends the worklist so that each entry has a DIExpression that is built up when iterating through the instructions. This allows us to describe instruction chains like this: $reg0 = mv $fp $reg0 = add $reg0, offset call @call_with_offseted_fp Since DW_OP_LLVM_entry_value operations can't be combined with any other expression, such call site entries will not be emitted. I have added a test, dbgcall-site-expr-entry-value.mir, which verifies that we don't assert or emit broken DWARF in such cases. Reviewers: djtodoro, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D75036	2020-02-27 11:18:51 +01:00
David Stenberg	ff574ff291	[DebugInfo][NFC] Move out lambdas from collectCallSiteParameters() Summary: This is a preparatory patch for D75036, in which a debug expression is associated with each parameter register in the worklist. In that patch the two lambda functions addToWorklist() and finishCallSiteParams() grow a bit, so move those out to separate functions. This patch also prepares for each parameter register having its own expression by moving the creation of the DbgValueLoc into finishCallSiteParams(). Reviewers: djtodoro, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D75050	2020-02-27 11:18:51 +01:00
Matt Arsenault	6fc0d00823	GlobalISel: Fix lowering for G_UADDE/G_USUBE The type parameter passed into lower is invalid and should be removed from the function.	2020-02-26 19:10:52 -08:00
Matt Arsenault	c7e8d8b13e	GlobalISel: Cleanup code with MachineIRBuilder features	2020-02-26 19:10:34 -08:00
Krzysztof Parzyszek	fd7c2e24c1	[SDAG] Add SDNode::values() = make_range(values_begin(), values_end()) Also use it in a few places to simplify code a little bit. NFC	2020-02-26 12:07:38 -06:00
Sanjay Patel	b3d0c79836	[DAGCombiner] avoid narrowing fake fneg vector op This may inhibit vector narrowing in general, but there's already an inconsistency in the way that we deal with this pattern as shown by the test diff. We may want to add a dedicated function for narrowing fneg. It's often folded into some other op, so moving it away from other math ops may cause regressions that we would not see for normal binops. See D73978 for more details.	2020-02-26 11:25:56 -05:00
Simon Pilgrim	bbb0933e3d	[DAG] visitRotate - modulo non-uniform constant rotation amounts	2020-02-26 15:43:12 +00:00
Sam Parker	1d06e75df2	[ARM][RDA] add getUniqueReachingMIDef Add getUniqueReachingMIDef to RDA which performs a global search for a machine instruction that produces a unique definition of a given register at a given point. Also add two helper functions (getMIOperand) that wrap around this functionality to get the incoming definition uses of a given instruction. These now replace the uses of getReachingMIDef in ARMLowOverheadLoops. getReachingMIDef has been renamed to getReachingLocalMIDef and has been made private along with getInstFromId. Differential Revision: https://reviews.llvm.org/D74605	2020-02-26 11:15:26 +00:00
Fangrui Song	b61a4aaca5	[MC] Default MCContext::UseNamesOnTempLabels to false and only set it to true for MCAsmStreamer Only MCAsmStreamer (assembly output) needs to keep names of temporary labels created by MCContext::createTempSymbol(). This change made the rL236642 optimization available for cc1as and probably some other users. This eliminates a behavior difference between llvm-mc -filetype=obj and cc1as, which caused https://reviews.llvm.org/D74006#1890487 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75097	2020-02-25 18:23:10 -08:00
Craig Topper	735d27dc40	[SelectionDAG][PowerPC][AArch64][X86][ARM] Add chain input and output to ISD::FLT_ROUNDS_ This node reads the rounding control which means it needs to be ordered properly with operations that change the rounding control. So it needs to be chained to maintain order. This patch adds a chain input and output to the node and connects it to the chain in SelectionDAGBuilder. I've updated all in-tree targets to connect their chain through their lowering code. Differential Revision: https://reviews.llvm.org/D75132	2020-02-25 16:58:23 -08:00
Quentin Colombet	5bf0023b0d	[GISel][KnownBits] Update a comment regarding the effect of cache on PHIs Unlike what I claimed in my previous commit, the caching is actually not NFC on PHIs. When we put a big enough max depth, we end up simulating loops. The cache is effectively cutting the simulation short and we get less information as a result. E.g., ``` v0 = G_CONSTANT i8 0xC0 jump v1 = G_PHI i8 v0, v2 v2 = G_LSHR i8 v1, 1 ``` Let's say we want the known bits of v1. - With cache: Set v1 cache to we know nothing v1 is v0 & v2 v0 gives us 0xC0 v2 gives us known bits of v1 >> 1 v1 is in the cache => v1 is 0, thus v2 is 0x80 Finally v1 is v0 & v2 => 0x80 - Without cache and enough depth to do two iterations of the loop: v1 is v0 & v2 v0 gives us 0xC0 v2 gives us known bits of v1 >> 1 v1 is v0 & v2 v0 is 0xC0 v2 is v1 >> 1 Reach the max depth for v1... unwinding v1 is know nothing v2 is 0x80 v0 is 0xC0 v1 is 0x80 v2 is 0xC0 v0 is 0xC0 v1 is 0xC0 Thus now v1 is 0xC0 instead of 0x80. I've added a unittest demonstrating that. NFC	2020-02-25 15:56:15 -08:00
Scott Linder	915b4aa139	Support emitting .cfi_undefined in CodeGen This will be used by AMDGPU. Differential Revision: https://reviews.llvm.org/D74914	2020-02-25 14:00:01 -05:00
Quentin Colombet	a12f1d6a52	[MachineInstr] Add a dumpr method Add a dump method that recursively prints an instruction and all the instructions defining its operands and so on. This is helpful when looking at combiner issues. NFC Differential Revision: https://reviews.llvm.org/D75094	2020-02-25 10:46:29 -08:00
Roman Lebedev	d20907d1de	[Codegen] Revert rL354676/rL354677 and followups - introduced PR43446 miscompile This reverts https://reviews.llvm.org/D58468 (rL354676, `44037d7a63`), and all and any follow-ups to that code block. https://bugs.llvm.org/show_bug.cgi?id=43446	2020-02-25 20:30:12 +03:00
Jay Foad	ccee390767	GlobalISel: NFC minor cleanup to avoid a couple of fixed size local arrays	2020-02-25 09:49:19 +00:00
Roman Tereshin	b3bce6a3dd	[MachineVerifier] Doing ::calcRegsPassed over faster sets: ~15-20% faster MV, NFC MachineVerifier still takes 45-50% of total compile time with -verify-machineinstrs, with calcRegsPassed dataflow taking ~50-60% of MachineVerifier. The majority of that time is spent in BBInfo::addPassed, mostly within DenseSet implementing the sets the dataflow is operating over. In particular, 1/4 of that DenseSet time is spent just iterating over it (operator++), 40-50% on insertions, and most of the rest in ::count. Given that, we're implementing custom sets just for this analysis here, focusing on cheap insertions and O(n) iteration time (as opposed to O(U), where U is the universe). As it's based _mostly_ on BitVector for sparse and SmallVector for dense, it may remotely resemble SparseSet. The difference is, our solution is a lot less clever, doesn't have constant time `clear` that we won't use anyway as reusing these sets across analyses is cumbersome, and thus more space efficient and safer (got a resizable Universe and a fallback to DenseSet for sparse if it gets too big). With this patch MachineVerifier gets ~15-20% faster, its contribution to total compile time drops from 45-50% to ~35%, while contribution of calcRegsPassed to MachineVerifier drops from 50-60% to ~35% as well. calcRegsPassed itself gets another 2x faster here. All measured on a large suite of shaders targeting a number of GPUs. Reviewers: bogner, stoklund, rudkx, qcolombet Reviewed By: rudkx Tags: #llvm Differential Revision: https://reviews.llvm.org/D75033	2020-02-24 19:01:21 -08:00
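The kind of set described above, with cheap insertion and iteration cost proportional to the number of members rather than to the universe, might look roughly like this. It is a sketch under assumptions, not the MachineVerifier code: the class name is hypothetical, the universe size is fixed, and the resizable universe and DenseSet fallback mentioned in the message are left out.

```python
class RegSetSketch:
    """Membership flags indexed by register plus a packed list of members."""

    def __init__(self, universe_size):
        # One flag per possible register: O(1) membership test and insert.
        self._member = bytearray(universe_size)
        # Packed list of inserted registers: iteration is O(n), not O(U).
        self._items = []

    def add(self, reg):
        # Insert reg; return True if it was not already present.
        if self._member[reg]:
            return False
        self._member[reg] = 1
        self._items.append(reg)
        return True

    def __contains__(self, reg):
        return 0 <= reg < len(self._member) and bool(self._member[reg])

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


regs = RegSetSketch(universe_size=1024)
first_insert = regs.add(900)
second_insert = regs.add(900)
regs.add(3)
members = list(regs)
```

Iterating touches only the two stored registers even though the universe holds 1024 slots, which is the property the message contrasts with O(U) iteration.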
Bill Wendling	23c2a5ce33	Allow "callbr" to return non-void values Summary: Terminators in LLVM aren't prohibited from returning values. This means that the "callbr" instruction, which is used for "asm goto", can support "asm goto with outputs." This patch removes all restrictions against "callbr" returning values. The heavy lifting is done by the code generator. The "INLINEASM_BR" instruction's a terminator, and the code generator doesn't allow non-terminator instructions after a terminator. In order to correctly model the feature, we need to copy outputs from "INLINEASM_BR" into virtual registers. Of course, those copies aren't terminators. To get around this issue, we split the block containing the "INLINEASM_BR" right before the "COPY" instructions. This results in two cheats: - Any physical registers defined by "INLINEASM_BR" need to be marked as live-in into the block with the "COPY" instructions. This violates an assumption that physical registers aren't marked as "live-in" until after register allocation. But it seems as if the live-in information only needs to be correct after register allocation. So we're able to get away with this. - The indirect branches from the "INLINEASM_BR" are moved to the "COPY" block. This is to satisfy PHI nodes. I've been told that MLIR can support this handily, but until we're able to use it, we'll have to stick with the above. Reviewers: jyknight, nickdesaulniers, hfinkel, MaskRay, lattner Reviewed By: nickdesaulniers, MaskRay, lattner Subscribers: rriddle, qcolombet, jdoerfert, MatzeB, echristo, MaskRay, xbolva00, aaron.ballman, cfe-commits, JonChesterfield, hiraditya, llvm-commits, rnk, craig.topper Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D69868	2020-02-24 18:29:06 -08:00
Matt Arsenault	11e3dde625	GlobalISel: Reimplement fewerElementsVectorBasic Changes the handling of odd breakdowns, and avoids using G_EXTRACT/G_INSERT. Pad with undef to a wider size, and unmerge. Also avoid introducing instructions for the fully undef components.	2020-02-24 21:19:47 -05:00
Craig Topper	a5fa778882	[LegalizeTypes] Scalarize non-byte sized loads in WidenRecRes_Load and SplitVecResLoad Should fix PR42803 and PR44902 Differential Revision: https://reviews.llvm.org/D74590	2020-02-24 15:14:33 -08:00
Roman Tereshin	6f87b162e6	[MachineVerifier] Doing ::calcRegsPassed in RPO: ~35% faster MV, NFC Depending on the target, test suite, pipeline config and perhaps other factors machine verifier when forced on with -verify-machineinstrs can increase compile time 2-2.5 times over (Release, Asserts On), taking up ~60% of the time. An invaluable tool, it significantly slows down machine verifier-enabled testing. Nearly 75% of its time MachineVerifier spends in the calcRegsPassed method. It's a classic forward dataflow analysis executed over sets, but visiting MBBs in arbitrary order. We switch that to RPO here. This speeds up MachineVerifier by about 35%, decreasing the overall compile time with -verify-machineinstrs by 20-25% or so. calcRegsPassed itself gets 2x faster here. All measured on a large suite of shaders targeting a number of GPUs. Reviewers: bogner, stoklund, rudkx, qcolombet Reviewed By: bogner Tags: #llvm Differential Revision: https://reviews.llvm.org/D75032	2020-02-24 13:30:01 -08:00
Simon Pilgrim	53b597cfa2	[SelectionDAG] Merge constant SDNode arithmetic into foldConstantArithmetic This is the second patch as part of https://bugs.llvm.org/show_bug.cgi?id=36544 Merging in the ConstantSDNode variant of FoldConstantArithmetic. After this, I will begin merging in FoldConstantVectorArithmetic I've ensured this patch can build & pass all lit tests in Windows and Linux environments. Patch by @justice_adams (Justice Adams) Differential Revision: https://reviews.llvm.org/D74881	2020-02-24 18:54:22 +00:00
Sjoerd Meijer	7efabe5c7d	[MIR][ARM] MachineOperand comments This adds infrastructure to print and parse MIR MachineOperand comments. The motivation for the ARM backend is to print condition code names instead of magic constants that are difficult to read (for human beings). For example, instead of this: dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14, $noreg t2Bcc %bb.4, 0, killed $cpsr we now print this: dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14 /* CC::always */, $noreg t2Bcc %bb.4, 0 /* CC:eq */, killed $cpsr This shows that MachineOperand comments are enclosed between /* and */. In this example, the EOR instruction is not conditionally executed (i.e. it is "always executed"), which is encoded by the 14 immediate machine operand. Thus, now this machine operand has /* CC::always */ as a comment. The 0 on the next conditional branch instruction represents the equal condition code, thus now this operand has /* CC:eq */ as a comment. As it is a comment, the MI lexer/parser completely ignores it. The benefit is that this keeps the change in the lexer extremely minimal and no target specific parsing needs to be done. The changes on the MIPrinter side are also minimal, as there is only one target hook that is used to create the machine operand comments. Differential Revision: https://reviews.llvm.org/D74306	2020-02-24 14:19:21 +00:00
Sam Parker	a67eb221e2	[RDA][ARM][LowOverheadLoops] Iteration count IT blocks Change the way that we remove the redundant iteration count code in the presence of IT blocks. collectLocalKilledOperands has been introduced to scan an instructions operands, collecting the killed instructions and then visiting them too. This is used to delete the code in the preheader which calculates the iteration count. We also track any IT blocks within the preheader and, if we remove all the instructions from the IT block, we also remove the IT instruction. isSafeToRemove is used to remove any redundant uses of the iteration count within the loop body. Differential Revision: https://reviews.llvm.org/D74975	2020-02-24 13:51:03 +00:00
Bevin Hansson	6e561d1c94	[Intrinsic] Add fixed point saturating division intrinsics. Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division: ``` llvm.sdiv.fix.sat.* llvm.udiv.fix.sat.* ``` These intrinsics perform scaled, saturating division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang. Reviewers: bjope, leonardchan, craig.topper Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71550	2020-02-24 10:50:52 +01:00
Bevin Hansson	c3f36acc92	[MC] Widen the functional unit type from 32 to 64 bits. Summary: The type used to represent functional units in MC is 'unsigned', which is 32 bits wide. This is currently not a problem in any upstream target as no one seems to have hit the limit on this yet, but in our downstream one, we need to define more than 32 functional units. Increasing the size does not seem to cause a huge size increase in the binary (an llc debug build went from 1366497672 to 1366523984, a difference of 26k), so perhaps it would be acceptable to have this patch applied upstream as well. Subscribers: hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71210	2020-02-24 09:37:00 +01:00
Craig Topper	3a6bb32bd2	[SelectionDAG] Remove ISD::LIFETIME_START/LIFETIME_END from assert in getMemIntrinsicNode. These appear to have their own SDNode type and shouldn't use MemIntrinsicSDNode.	2020-02-23 22:32:36 -08:00
Florian Hahn	7769030b93	Recommit "[PatternMatch] Match XOR variant of unsigned-add overflow check." This version fixes a buildbot failure caused by picking the wrong insert point for XORs. We cannot pick the XOR binary operator as insert point, as it is not guaranteed that both input operands for the overflow intrinsic are defined before it. This reverts the revert commit `c7fc0e5da6`.	2020-02-23 18:33:18 +00:00
Sanjay Patel	a253a2a793	[SDAG] fold fsub -0.0, undef to undef rather than NaN A question about this behavior came up on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html ...and as part of backend improvements in D73978. We decided not to implement a more general change that would have folded any FP binop with nearly arbitrary constant + undef operand to undef because that is not theoretically correct (even if it is practically correct). This is the SDAG-equivalent to the IR change in D74713.	2020-02-23 11:36:53 -05:00
Quentin Colombet	b6d63c92ec	[GISel][KnownBits] Suppress unused warning on the dump method NFC	2020-02-21 21:07:04 -08:00
Quentin Colombet	618dec2aef	[GISel][KnownBits] Add a cache mechanism to speed compile time This patch adds a cache that is valid only for the duration of a call to getKnownBits. With such short lived cache we avoid all the problems of cache invalidation while still getting the benefits of reusing the information we already computed. This cache is useful whenever an instruction occurs more than once in a chain of computation. E.g., v0 = G_ADD v1, v2 v3 = G_ADD v0, v1 Previously we would compute the known bits for: v1, v2, v0, then v1 again and finally v3. With the patch, now we won't have to recompute v1 again. NFC	2020-02-21 14:31:42 -08:00
Francesco Petrogalli	31ec721516	[llvm][CodeGen] DAG Combiner folds for vscale. Summary: This patch simplifies the DAGs generated when using the intrinsic `@llvm.vscale.` as follows: * Fold (add (vscale * C0), (vscale * C1)) to (vscale * (C0 + C1)). * Canonicalize (sub X, (vscale * C)) to (add X, (vscale * -C)). * Fold (mul (vscale * C0), C1) to (vscale * (C0 * C1)). * Fold (shl (vscale * C0), C1) to (vscale * (C0 << C1)). The test `sve-gep-ll` has been updated to reflect the folding introduced by this patch. Reviewers: efriedma, sdesmalen, andwar, rengolin Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74782	2020-02-21 18:03:12 +00:00
Hiroshi Yamauchi	0e3e242209	[BFI] Fix missed BFI updates in MachineSink. Summary: This prevents BFI queries on new blocks (from MachineSinking::GetAllSortedSuccessors) and fixes a bunch of assert failures under -check-bfi-unknown-block-queries=true. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74511	2020-02-21 09:50:54 -08:00
Nikita Popov	a8db806d52	[SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls This changes the SimplifyLibCalls utility to accept an IRBuilderBase, which allows us to pass through the IRBuilder used by InstCombine. This will ensure that new instructions get added to the worklist. The annotated test-case drops from 4 to 2 InstCombine iterations thanks to this. To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard, which is basically the same as the existing InsertPointGuard and FastMathFlagsGuard, but for operand bundles. Also add a setDefaultOperandBundles() method so these can be set outside the constructor. Differential Revision: https://reviews.llvm.org/D74792	2020-02-21 18:26:05 +01:00
Jay Foad	cab39e4b8c	GlobalISel: Fix narrowing of (G_ASHR i64:x, 32) Reviewers: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74950	2020-02-21 16:51:03 +00:00
Simon Pilgrim	42ec6fdce9	[TargetLowering] Apply basic shift combines before recursive SimplifyDemandedBits calls. Minor refactor/cleanup before we begin adding non-uniform support.	2020-02-21 16:31:20 +00:00
Simon Pilgrim	86c52af05a	[TargetLowering] SimplifyDemandedBits - use getValidShiftAmountConstant helper. Use the SelectionDAG::getValidShiftAmountConstant helper to get const/constsplat shift amounts, which allows us to drop the out of range shift amount early-out. First step towards better non-uniform shift amount support in SimplifyDemandedBits.	2020-02-21 14:23:53 +00:00
Sam Clegg	df74033ec9	[WebAssembly] Remove unneeded getWasmKindForNamedSection function I believe this was carried over from getELFKindForNamedSection since the wasm backend originally used ELF object writing as a template. Differential Revision: https://reviews.llvm.org/D74565	2020-02-20 22:49:08 -08:00
Eli Friedman	c767cf24e4	[SVE] Add support for lowering GEPs involving scalable vectors. This includes both GEPs where the indexed type is a scalable vector, and GEPs where the result type is a scalable vector. Differential Revision: https://reviews.llvm.org/D73602	2020-02-20 13:45:41 -08:00
Quentin Colombet	e4a9225f5d	[GISel][KnownBits] Give up on PHI analysis as soon as we don't know anything When analyzing PHIs, we gather the known bits for every operand and merge them together to get the known bits of the result of the PHI. It is not unusual that merging the information leads to know nothing on the result (e.g., phi a: i8 3, b: i8 unknown, ..., after looking at the second argument we know we will know nothing on the result), thus, as soon as we reach that state, stop analyzing the following operand (i.e., on the previous example, we won't process anything after looking at `b`). This improves compile time in particular with PHIs with a large number of operands. NFC.	2020-02-20 11:34:01 -08:00
Simon Pilgrim	f9c326364e	[DAGCombiner] Use SDValue::getConstantOperandAPInt helper where possible. NFC.	2020-02-20 18:23:05 +00:00
Simon Pilgrim	fc2b4a02b1	[DAGCombine] visitEXTRACT_VECTOR_ELT - add SimplifyDemandedBits multi use support Similar to what we already do with SimplifyDemandedVectorElts, call SimplifyDemandedBits across all the extracted elements of the source vector, treating it as single use. There's a minor regression in store-weird-sizes.ll which will be addressed in an upcoming SimplifyDemandedBits patch.	2020-02-20 15:49:38 +00:00
Sam Parker	659500c0c9	[NFC][RDA] Break-up initialization code Separate out the initialization code from the loop traversal so that the analysis can be reset and re-run by a user.	2020-02-20 14:59:42 +00:00
Djordje Todorovic	2f215cf36a	Revert "Reland "[DebugInfo] Enable the debug entry values feature by default"" This reverts commit rGfaff707db82d. A failure found on an ARM 2-stage buildbot. The investigation is needed.	2020-02-20 14:41:39 +01:00
Bill Wendling	129c911efa	Include static prof data when collecting loop BBs Summary: If the programmer adds static profile data to a branch---i.e. uses "__builtin_expect()" or similar---then we should honor it. Otherwise, "__builtin_expect()" is ignored in crucial situations. So we trust that the programmer knows what they're doing until proven wrong. Subscribers: hiraditya, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74809	2020-02-19 11:33:48 -08:00
Florian Hahn	c7fc0e5da6	Revert "[PatternMatch] Match XOR variant of unsigned-add overflow check." This reverts commit `e01a3d49c2`. and commit `a6a585b803`. This causes a failure on GreenDragon: http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/9597	2020-02-19 19:37:08 +01:00
Florian Hahn	e01a3d49c2	[PatternMatch] Match XOR variant of unsigned-add overflow check. Instcombine folds (a + b <u a) to (a ^ -1 <u b) and that does not match the expected pattern in CodeGenPrepare via UAddWithOverflow. This causes a regression over Clang 7 on both X86 and AArch64: https://gcc.godbolt.org/z/juhXYV This patch extends UAddWithOverflow to also catch the XOR case, if the XOR is only used in the ICMP. This covers just a single case, but I'd like to make sure I am not missing anything before tackling the other cases. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D74228	2020-02-19 15:25:18 +01:00
Florian Hahn	216afd3301	[TargetLower] Update shouldFormOverflowOp check if math is used. On some targets, like SPARC, forming overflow ops is only profitable if the math result is used: https://godbolt.org/z/DxSmdB This patch adds a new MathUsed parameter to allow the targets to make the decision and defaults to only allowing it if the math result is used. That is the conservative choice. This patch also updates AArch64ISelLowering, X86ISelLowering, ARMISelLowering.h, SystemZISelLowering.h to allow forming overflow ops if the math result is not used. On those targets using the overflow intrinsic for the overflow check only generates better code. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D74722	2020-02-19 11:28:33 +01:00
Djordje Todorovic	faff707db8	Reland "[DebugInfo] Enable the debug entry values feature by default" Differential Revision: https://reviews.llvm.org/D73534	2020-02-19 11:12:26 +01:00
Aditya Nandakumar	b91d9ec0bb	[GlobalISel]: Fix some non determinism exposed in CSE due to not notifying observers about mutations + add verification for CSE https://reviews.llvm.org/D67133 While investigating some non determinism (CSE doesn't produce wrong code, it just doesn't CSE some times) in GISel CSE on an out of tree target, I realized that the core issue was that there were lots of code that mutates (setReg, setRegClass etc), but doesn't notify observers (CSE in this case but this could be any other observer). In order to make the Observer be available in various parts of code and to avoid having to thread it through various API, the MachineFunction now has the observer as field. This allows it to be easily used in helper functions such as constrainOperandRegClass. Also added some invariant verification method in CSEInfo which can catch these issues (when CSE is enabled).	2020-02-18 14:54:57 -08:00
Thomas Lively	9d37f5afac	[WebAssembly] Implement multivalue call_indirects Summary: Unlike normal calls, call_indirects have immediate arguments that caused a MachineVerifier failure without a small tweak to loosen the verifier's requirements for variadicOpsAreDefs instructions. One nice thing about the new call_indirects is that they do not need to participate in the PCALL_INDIRECT mechanism because their post-isel hook handles moving the function pointer argument and adding the flags and typeindex arguments itself. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74191	2020-02-18 13:49:46 -08:00
Thomas Lively	7b64a59060	Reland "[WebAssembly][InstrEmitter] Foundation for multivalue call lowering" This reverts commit `649aba93a2`, now that the approach started there has been shown to be workable in the patch series culminating in https://reviews.llvm.org/D74192.	2020-02-18 13:49:46 -08:00
Simon Pilgrim	d6eef0614f	[TargetLowering] Add SimplifyMultipleUseDemandedBits 'all elements' helper wrapper. NFC.	2020-02-18 19:53:50 +00:00
Huihui Zhang	8ee0e1dc02	[NFC] Silence compiler warning [-Wmissing-braces].	2020-02-18 10:37:12 -08:00
Sander de Smalen	8fbc925807	Add OffsetIsScalable to getMemOperandWithOffset Summary: Making `Scale` a `TypeSize` in AArch64InstrInfo::getMemOpInfo, has the effect that all places where this information is used (notably, TargetInstrInfo::getMemOperandWithOffset) will need to consider Scale - and derived, Offset - possibly being scalable. This patch adds a new operand `bool &OffsetIsScalable` to TargetInstrInfo::getMemOperandWithOffset and fixes up all the places where this function is used, to consider the offset possibly being scalable. In most cases, this means bailing out because the algorithm does not (or cannot) support scalable offsets in places where it does some form of alias checking for example. Reviewers: rovka, efriedma, kristof.beyls Reviewed By: efriedma Subscribers: wuzish, kerbowa, MatzeB, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, javed.absar, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72758	2020-02-18 15:53:29 +00:00
Djordje Todorovic	2bf44d11cb	Revert "Reland "[DebugInfo] Enable the debug entry values feature by default"" This reverts commit rGa82d3e8a6e67.	2020-02-18 16:38:11 +01:00
Djordje Todorovic	a82d3e8a6e	Reland "[DebugInfo] Enable the debug entry values feature by default" This patch enables the debug entry values feature. - Remove the (CC1) experimental -femit-debug-entry-values option - Enable it for x86, arm and aarch64 targets - Resolve the test failures - Leave the llc experimental option for targets that do not support the CallSiteInfo yet Differential Revision: https://reviews.llvm.org/D73534	2020-02-18 14:41:08 +01:00
James Clarke	b3cd44f80b	Use SETNE directly rather than SUB/SETNE 0 for stack guard check Summary: Backends should fold the subtraction into the comparison, but not all seem to. Moreover, on targets where pointers are not integers, such as CHERI, an integer subtraction is not appropriate. Instead we should just compare the two pointers directly, as this should work everywhere and potentially generate more efficient code. Reviewers: bogner, lebedev.ri, efriedma, t.p.northover, uweigand, sunfish Reviewed By: lebedev.ri Subscribers: dschuff, sbc100, arichardson, jgravelle-google, hiraditya, aheejin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74454	2020-02-18 13:21:26 +00:00
Djordje Todorovic	a5ac8ca3e0	[CSInfo][TailDuplicator] Delete the call site info when removing dead MBBs This is needed for the debug entry values feature. Differential Revision: https://reviews.llvm.org/D74702	2020-02-18 12:29:51 +01:00
Jim Lin	466f8843f5	[NFC] Remove trailing space sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}	2020-02-18 10:49:13 +08:00
Vedant Kumar	3f148eabe0	[LiveDebugValues] Visit open var locs just once in transferRegisterDef, NFC For a file in WebKit, this brings the time spent in LiveDebugValues down from 16 minutes to 2 minutes. The reduction comes from iterating the set of open variable locations just once in transferRegisterDef. Post-patch, the most expensive item inside of transferRegisterDef is a call to VarLoc::isDescribedByReg, which we have to do. Testing: I built LNT using the Os-g cmake cache with & without this patch, then diffed the object files to verify there was no binary diff. rdar://59446577 Differential Revision: https://reviews.llvm.org/D74633	2020-02-17 14:04:22 -08:00
Matt Arsenault	0e2eb357e0	GlobalISel: Extend narrowing to G_ASHR	2020-02-17 10:42:59 -08:00
Matt Arsenault	8550859535	GlobalISel: Extend shift narrowing to G_SHL	2020-02-17 09:13:37 -08:00
Benjamin Kramer	564a9de28e	Hide implementation details. NFC.	2020-02-17 17:55:23 +01:00
Simon Pilgrim	a1585aec6f	[SelectionDAG] Expose the "getValidShiftAmount" helpers available. NFCI. These are going to be useful in TargetLowering::SimplifyDemandedBits, so expose these helpers outside of SelectionDAG.cpp Also add an getValidShiftAmountConstant early-out to getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant so we can use them for scalar cases as well.	2020-02-17 16:28:46 +00:00
Matt Arsenault	78d455adf0	GlobalISel: Add combine to narrow G_LSHR Produce an unmerge to a narrower type and introduce a narrower shift if needed. I wasn't sure if there was a better way to parameterize the target's preferred shift type for the GICombineRule, so manually call the combine helper.	2020-02-17 08:04:52 -08:00
Sander de Smalen	a7a96c726e	[AArch64] Implement passing SVE vectors by ref for AAPCS. Summary: This patch implements the part of the calling convention where SVE Vectors are passed by reference. This means the caller must allocate stack space for these objects and pass the address to the callee. Reviewers: efriedma, rovka, cameron.mcinally, c-rhodes, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71216	2020-02-17 15:20:28 +00:00
Sjoerd Meijer	dad5f00e3b	[DAGCombine] Combine pattern for REV16 This adds another pattern to the combiner for a case that we were not handling to generate the REV16 instruction for ARM/Thumb2 and a bswap+ror on X86. Differential Revision: https://reviews.llvm.org/D74032	2020-02-17 14:54:17 +00:00
Benjamin Kramer	5fc5c7db38	Strength reduce vectors into arrays. NFCI.	2020-02-17 15:37:35 +01:00
Fangrui Song	549b436beb	[MC] De-capitalize MCStreamer::Emit{Bundle,Addrsig}* etc So far, all non-COFF-related Emit* functions have been de-capitalized.	2020-02-15 09:11:48 -08:00
Simon Pilgrim	ce2b5f1569	Fix gcc9.2 -Winit-list-lifetime warning. NFCI. Reported by @lbenes (Luke Benes)	2020-02-15 16:48:51 +00:00
Fangrui Song	774971030d	[MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex}	2020-02-14 23:08:40 -08:00
Fangrui Song	895cad1a13	[AsmPrinter][XRay] Omit unique ID for xray_instr_map and xray_fn_idx Follow-up for D74006.	2020-02-14 21:10:46 -08:00
Diogo Sampaio	8bc790f9e6	[AArch64][FPenv] Update chain of int to fp conversion Summary: When using strict fp, it is required to update the chain when performing integer type promotion of an operand to an integer to floating point conversion. Reviewers: craig.topper, john.brawn Reviewed By: craig.topper Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74597	2020-02-15 05:07:34 +00:00
Fangrui Song	f554e27224	[AsmPrinter] Omit unique ID for __patchable_function_entries sections Follow-up for D74006. When the integrated assembler is used, we use SHF_LINK_ORDER. The linked-to symbol is part of ELFSectionKey, thus we can omit the unique ID.	2020-02-14 20:54:54 -08:00
Fangrui Song	1dc16c752d	[MC] Add MCSection::NonUniqueID and delete one MCContext::getELFSection overload	2020-02-14 20:25:52 -08:00
Fangrui Song	6d2d589b06	[MC] De-capitalize another set of MCStreamer::Emit* functions Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc	2020-02-14 19:26:52 -08:00
Fangrui Song	a55daa1461	[MC] De-capitalize some MCStreamer::Emit* functions	2020-02-14 19:11:53 -08:00
Matt Arsenault	3bb0ff8341	GlobalISel: Remove unused function argument	2020-02-14 15:57:39 -08:00
Sean Fertile	b75692c30e	[AsmPrinter] Use the MCAsmInfo to determine if we need descriptors. In https://reviews.llvm.org/rG8b737688c21a9755cae14cb9343930e0882164ab I switched the condition gating the creation of the descriptor symbol from checking the MCAsmInfo if we need to support descriptors, to if the OS was AIX. Technically the 2 should be interchangeable: if we are targeting AIX then we need to emit XCOFF object files, and the MCAsmInfo must return true for needing function descriptors. This doesn't account for lit tests with run steps that only set the arch. E.g., test/CodeGen/XCore/section-name.ll, where running natively on AIX we end up with a target xcore-ibm-aix and needFunctionDescriptors is false. This patch reverts to using the MCAsmInfo and adds an assert that the target OS must be AIX since that is the only target using the descriptor hook. Differential Revision: https://reviews.llvm.org/D74622	2020-02-14 15:20:39 -05:00
Matt Arsenault	bfbfa18591	GlobalISel: Lower s64->s16 G_FPTRUNC This is more or less directly ported from the AMDGPU custom lowering for FP_TO_FP16. I made a few minor fixups (using G_UNMERGE_VALUES instead of creating shift/trunc to extract the two halves, and zexting an inverted compare instead of select_cc). This also does not include the fast math expansion the DAG which converts to f32 and then to f16. I think that belongs in a pre-legalize combine instead.	2020-02-14 10:46:58 -08:00
Volkan Keles	187686a22f	[GlobalISel] LegalizationArtifactCombiner: Fix a bug in tryCombineMerges Like COPY instructions explained in D70616, we don't check the constraints when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check if registers can be replaced, or a COPY instruction needs to be built. https://reviews.llvm.org/D70564	2020-02-14 10:45:58 -08:00
Alexandre Ganea	8404aeb56a	[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used, in some cases dispatched only on one CPU socket. == Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, a hyper-thread if SMT is active; a core otherwise. There's a limit of 32 processors on older 32-bit versions of Windows, which later was raised to 64 processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads. By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to. This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem == The heavyweight_hardware_concurrency() API was introduced so that only one hardware thread per core was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group". == The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO). When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above the maximum number of threads will have no effect, the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware core will be used. When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once. Differential Revision: https://reviews.llvm.org/D71775	2020-02-14 10:24:22 -05:00
Fangrui Song	bcd24b2d43	[AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI*	2020-02-13 22:08:55 -08:00
Fangrui Song	1d49eb00d9	[AsmPrinter] De-capitalize all AsmPrinter::Emit* but EmitInstruction Similar to rL328848.	2020-02-13 17:06:24 -08:00
Vedant Kumar	3091049446	Add dbgs() output to help track down missing DW_AT_location bugs, NFC	2020-02-13 14:38:44 -08:00
Vedant Kumar	8e77b33b3c	[Local] Do not move around dbg.declares during replaceDbgDeclare replaceDbgDeclare is used to update the descriptions of stack variables when they are moved (e.g. by ASan or SafeStack). A side effect of replaceDbgDeclare is that it moves dbg.declares around in the instruction stream (typically by hoisting them into the entry block). This behavior was introduced in llvm/r227544 to fix an assertion failure (llvm.org/PR22386), but no longer appears to be necessary. Hoisting a dbg.declare generally does not create problems. Usually, dbg.declare either describes an argument or an alloca in the entry block, and backends have special handling to emit locations for these. In optimized builds, LowerDbgDeclare places dbg.values in the right spots regardless of where the dbg.declare is. And no one uses replaceDbgDeclare to handle things like VLAs. However, there doesn't seem to be a positive case for moving dbg.declares around anymore, and this reordering can get in the way of understanding other bugs. I propose getting rid of it. Testing: stage2 RelWithDebInfo sanitized build, check-llvm rdar://59397340 Differential Revision: https://reviews.llvm.org/D74517	2020-02-13 14:35:02 -08:00
Fangrui Song	0bc77a0f0d	[AsmPrinter] De-capitalize some AsmPrinter::Emit* functions Similar to rL328848.	2020-02-13 13:38:33 -08:00
Fangrui Song	0dce409cee	[AsmPrinter] De-capitalize Emit{Function,BasicBlock}* and Emit{Start,End}OfAsmFile	2020-02-13 13:22:49 -08:00
Matt Arsenault	de256478e6	GlobalISel: Don't use LLT references These should always be passed by value	2020-02-13 15:25:30 -05:00
Simon Pilgrim	32176133fa	Move FIXME to start of comment so visual studio actually tags it. NFC.	2020-02-13 14:28:50 +00:00
Serguei Katkov	a6f38b4697	[Statepoint] Remove redundant clear of call target on register Patchable statepoint is lowered into sequence of nops, so zeroed call target should not be on register. It is better to use getTargetConstant instead of getConstant to select zero constant for call target. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D74465	2020-02-13 10:25:50 +07:00
Fangrui Song	c662795b07	[AsmPrinter][ELF] Emit local alias for ExternalLinkage dso_local GlobalAlias	2020-02-12 17:08:22 -08:00
Guozhi Wei	369d086d78	[MBP] Partial tail duplication into hot predecessors Current tail duplication embedded in MBP duplicates a BB into all or none of its predecessors without too much cost analysis. So sometimes it is duplicated into cold predecessors, and in other cases it may miss the duplication into hot predecessors. This patch improves tail duplication in 3 aspects: A successor can be duplicated into part of its predecessors. A more fine-grained benefit analysis, combined with 1, now a successor is duplicated into hot predecessors only. If a successor can't be duplicated into one predecessor, it doesn't impact the duplication into other predecessors. Differential Revision: https://reviews.llvm.org/D73387	2020-02-12 15:22:33 -08:00
Jay Foad	32aac25637	[KnownBits] Introduce anyext instead of passing a flag into zext Summary: This was a very odd API, where you had to pass a flag into a zext function to say whether the extended bits really were zero or not. All callers passed in a literal true or false. I think it's much clearer to make the function name reflect the operation being performed on the value we're tracking (rather than on the KnownBits Zero and One fields), so zext means the value is being zero extended and new function anyext means the value is being extended with unknown bits. NFC. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74482	2020-02-12 19:06:53 +00:00
Simon Pilgrim	9eb426c88c	[TargetLowering] Add NegatibleCost enum for isNegatibleForFree return codes The isNegatibleForFree/getNegatedExpression methods currently rely on a raw char value to indicate whether a negation is beneficial or not. This patch replaces the char return value with a NegatibleCost enum to more clearly demonstrate what is implied. It also renames isNegatibleForFree to getNegatibleCost to more accurately reflect what's going on. Differential Revision: https://reviews.llvm.org/D74221	2020-02-12 11:51:42 +00:00
Djordje Todorovic	97ed706a96	Revert "[DebugInfo] Enable the debug entry values feature by default" This reverts commit rG9f6ff07f8a39. Found a test failure on clang-with-thin-lto-ubuntu buildbot.	2020-02-12 11:59:04 +01:00
Clement Courbet	15488ff24b	[CodeGen] Fix the computation of the alignment of split stores. Summary: Right now the alignment of the lower half of a store is computed as align/2, which fails for unaligned stores (align = 1), and is overly pessimistic for, e.g., an 8-byte store aligned to 4 bytes. Fixes PR44851 Fixes PR44877 Reviewers: gchatelet, spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74311	2020-02-12 10:37:30 +01:00
Djordje Todorovic	9f6ff07f8a	[DebugInfo] Enable the debug entry values feature by default This patch enables the debug entry values feature. - Remove the (CC1) experimental -femit-debug-entry-values option - Enable it for x86, arm and aarch64 targets - Resolve the test failures - Leave the llc experimental option for targets that do not support the CallSiteInfo yet Differential Revision: https://reviews.llvm.org/D73534	2020-02-12 10:25:14 +01:00
Nicolai Hähnle	07a5b849f7	SelectionDAG: Fix bug in ClusterNeighboringLoads Summary: The method attempts to find loads that can be legally clustered by looking for loads consuming the same chain glue token. However, the old code looks at _all_ users of values produced by the chain node -- including uses of the loaded/returned value of volatile loads or atomics. This could lead to circular dependencies which then failed during scheduling. With this change, we filter out users by getResNo, i.e. by which SDValue value they use, to ensure that we only look at users of the chain glue token. This appears to be a rather old bug, which is perhaps surprising. However, the test case is actually quite fragile (i.e., it is hidden by fairly small changes), and the test _must_ use volatile loads for the bug to manifest. Reviewers: arsenm, bogner, craig.topper, foad Subscribers: MatzeB, jvesely, wdng, hiraditya, javed.absar, jfb, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74253	2020-02-12 09:12:55 +01:00
Craig Topper	0daf9b8e41	[X86][LegalizeTypes] Add SoftPromoteHalf support STRICT_FP_EXTEND and STRICT_FP_ROUND This adds a strict version of FP16_TO_FP and FP_TO_FP16 and uses them to implement soft promotion for the half type. This is enough to provide basic support for __fp16 with strictfp. Add the necessary X86 support to use VCVTPS2PH/VCVTPH2PS when F16C is enabled.	2020-02-11 22:30:04 -08:00
lewis-revill	a6bd1256ce	[DebugInfo] Call site entries cannot be generated for FrameSetup calls Instructions marked as FrameSetup do not cause requestLabelAfterInsn to be called and so no such label is generated. Call instructions which require call site entries to be generated require this label to be present in order to calculate the return PC offset/address, but the check for whether the call instruction is marked as FrameSetup was not present. Therefore in the case where a call instruction is marked as FrameSetup, an assertion failure occurs if a call site entry is to be generated. This is the case with RISC-V's implementation of save/restore via library calls. Differential Revision: https://reviews.llvm.org/D71593	2020-02-11 21:23:18 +00:00
OCHyams	35e0ab647b	[DebugInfo][NFC] Fixup the UserValue methods to use FragmentInfo Fixup the UserValue methods to use FragmentInfo instead of DIExpression because the DIExpression is only ever used to get the FragmentInfo. The DIExpression is meaningless in the UserValue class because each definition point added to a UserValue may have a unique DIExpression. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D74057	2020-02-11 10:20:24 +00:00
OCHyams	3aa33fde03	[DebugInfo][NFC] Rename the class DbgValueLocation to DbgVariableValue Rename the class DbgValueLocation to DbgVariableValue and instances from Loc to DbgValue. These names better express the new semantics introduced in D74053. The class previously represented a { Location } only. It now represents a { Location, DIExpression } pair which together describe a value. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D74055	2020-02-11 10:20:24 +00:00
OCHyams	1e40799324	[DebugInfo] Teach LDV how to handle identical variable fragments LiveDebugVariables uses interval maps to explicitly represent DBG_VALUE intervals. DBG_VALUEs are filtered into an interval map based on their { Variable, DIExpression }. The interval map will coalesce adjacent entries that use the same { Location }. Under this model, DBG_VALUEs which refer to the same bits of the same variable will be filtered into different interval maps if they have different DIExpressions, which means the original intervals will not be properly preserved. This patch fixes the problem by using { Variable, Fragment } to filter the DBG_VALUEs into maps, and coalesces adjacent entries iff they have the same { Location, DIExpression } pair. The solution is not perfect because similar issues appear when partially overlapping fragments are encountered, but it is far simpler than a complete solution (i.e. D70121). Fixes: pr41992, pr43957 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D74053	2020-02-11 10:20:24 +00:00
Amara Emerson	067dd9c6b1	[GlobalISel][CallLowering] Use stripPointerCasts(). A downstream test exposed a simple logic bug with the manual pointer stripping code, fix that by just using stripPointerCasts() on the value. I don't think there's a way to expose this issue upstream.	2020-02-10 15:43:57 -08:00
Matt Arsenault	f270da6bfc	RegisterCoalescer: Add LaneMask to debug printing	2020-02-10 12:34:33 -08:00
diggerlin	aa86311e62	[AIX][XCOFF] Support Mergeable2ByteCString and Mergeable4ByteCString SUMMARY: This patch enables support for Mergeable2ByteCString and Mergeable4ByteCString Reviewers: daltenty Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D74164	2020-02-10 14:45:54 -05:00
Sebastian Neubauer	7cddd15e56	[SelectionDAG] Optimize build_vector of truncates and shifts Add a simplification to fuse a manual vector extract with shifts and truncate into a bitcast. Unpacking and packing values into vectors is only optimized with extractelement instructions, not when manually unpacked using shifts and truncates. This patch simplifies shifts and truncates into a bitcast if possible. Simplify (build_vec (trunc $1) (trunc (srl $1 width)) (trunc (srl $1 (2 * width))) ...) to (bitcast $1) Differential Revision: https://reviews.llvm.org/D73892	2020-02-10 15:04:07 +01:00
Djordje Todorovic	3a4dc577c9	[CSInfo] Fix the assertions regarding updating the CSInfo The call site info was not updated correctly when deleting corresponding call instructions. Differential Revision: https://reviews.llvm.org/D73700	2020-02-10 10:55:06 +01:00
Djordje Todorovic	68908993eb	[CSInfo] Use isCandidateForCallSiteEntry() when updating the CSInfo Use the isCandidateForCallSiteEntry(). This should mostly be an NFC, but there are some parts ensuring the moveCallSiteInfo() and copyCallSiteInfo() operate with call site entry candidates (both Src and Dest should be the call site entry candidates). Differential Revision: https://reviews.llvm.org/D74122	2020-02-10 10:03:14 +01:00
Amara Emerson	21c9d9ad43	[GlobalISel][CallLowering] Tighten constantexpr check for callee. I'm not sure there's a test case for this, but it's better to be safe.	2020-02-09 22:59:48 -08:00
Matt Arsenault	312a9d1b83	GlobalISel: Fix narrowScalar for G_{CTLZ|CTTZ}_ZERO_UNDEF Narrow these for 64-bit VALU for AMDGPU.	2020-02-09 19:02:38 -05:00
Matt Arsenault	6135f5eda4	GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ The result type is separate from the source type.	2020-02-09 18:11:43 -05:00
Craig Topper	eeb63944e4	[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results. Remove code from LegalizeTypes that allowed this to work. We were already using BUILD_PAIR for this in some places so this standardizes on a single way to do this.	2020-02-08 09:52:31 -08:00
Craig Topper	2af1640f9a	[LegalizeDAG][X86][AMDGPU] Use ANY_EXTEND instead of ZERO_EXTEND when promoting ISD::CTTZ/CTTZ_ZERO_UNDEF. Summary: For CTTZ we place a set bit just past where the non-promoted type stopped so the extended bits won't be used for the count. For CTTZ_ZERO_UNDEF we don't care what happens if no bits are set in the original type and we end up counting into the extended bits. So we can just use ANY_EXTEND for both cases. This matches what is done in type legalization for these operations. We make no effort to force the upper bits to zero. Differential Revision: https://reviews.llvm.org/D74111	2020-02-07 22:25:56 -08:00
Amara Emerson	35c63d66aa	[GlobalISel][CallLowering] Look through bitcasts from constant function pointers. Calls to ObjC's objc_msgSend function are done by bitcasting the function global to the required function type signature. This patch looks through this bitcast so that we can do a direct call with bl on arm64 instead of using an indirect blr. Differential Revision: https://reviews.llvm.org/D74241	2020-02-07 15:32:54 -08:00
Vedant Kumar	0d0ef315cb	[MachineInstr] Add isCandidateForCallSiteEntry predicate Add the isCandidateForCallSiteEntry predicate to MachineInstr to determine whether a DWARF call site entry should be created for an instruction. For now, it's enough to have any call instruction that doesn't belong to a blacklisted set of opcodes. For these opcodes, a call site entry isn't meaningful. Differential Revision: https://reviews.llvm.org/D74159	2020-02-07 10:10:41 -08:00
Petar Avramovic	7df5fc9e03	[GlobalISel] Add buildMerge with SrcOp initializer list Allows more flexible use of buildMerge in places where use operands are available as SrcOp since it does not require explicit conversion to Register. Simplify code with new buildMerge. Differential Revision: https://reviews.llvm.org/D74223	2020-02-07 18:43:45 +01:00
Amara Emerson	28d22c2c9c	[GlobalISel][IRTranslator] Add special case support for ~memory inline asm clobber. This is a one off special case, since actually implementing full inline asm support will be much more involved. This lets us compile a lot more code as a common simple case. Differential Revision: https://reviews.llvm.org/D74201	2020-02-07 08:55:23 -08:00
Jinsong Ji	01edae1271	[AsmPrinter] Print FP constant in hexadecimal form instead Printing floating point numbers in decimal is inconvenient for humans. Verbose asm output will print out floating point values in comments, which helps. But in lots of cases, users still need additional work to convert the decimal back to hex or binary to check the bit patterns, especially when there are small precision differences. Hexadecimal form is one of the supported forms in LLVM IR, and is easier for debugging. This patch tries to print all FP constants in hex form instead. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D73566	2020-02-07 16:00:55 +00:00
Matt Arsenault	3b198518ad	GlobalISel: Fix narrowing of G_CTPOP The result type is separate from the source type. Tests will be included in a future AMDGPU patch which uses this from RegBankSelect/applyMappingImpl.	2020-02-07 06:58:00 -08:00
Matt Arsenault	8de2dad9e0	GlobalISel: Fix lowering of G_CTLZ/G_CTTZ The type passed to lower was invalid, so I'm not sure how this was even working before. The source and destination type also do not have to match, so make sure to use the right ones.	2020-02-07 06:54:12 -08:00
Guillaume Chatelet	f85d3408e6	[NFC] Introduce an API for MemOp Summary: This patch introduces an API for MemOp in order to simplify and tighten the client code. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73964	2020-02-07 11:32:27 +01:00
Sourabh Singh Tomar	84e5760a16	[DebugInfo]: Reordered the emission of debug_str section. Summary: This patch reorders the emission of the debug_str section, so that strings can come after macros. This is necessary for macro forms like DW_MACRO_define_strp, which emit the macro as a string in the debug_str section.	2020-02-07 11:15:55 +05:30
Amara Emerson	ac8a12c874	[GlobalISel] Use G_ZEXTLOAD instead of an anyextending load for non-pow-2 legalization. Fixes PR43288	2020-02-06 14:36:36 -08:00
Konstantin Schwarz	76986bdc46	[GlobalISel] Legalize more G_FP(EXT|TRUNC) libcalls. This adds a new helper function for retrieving the floating point type corresponding to the specified bit-width.	2020-02-06 11:41:34 -08:00
Fangrui Song	727362e87b	[MC][ELF] Rename MC related "Associated" to "LinkedToSym" "linked-to section" is used by the ELF spec. By analogy, "linked-to symbol" is a good name for the signature symbol. The word "linked-to" implies a directed edge and makes it clear its relation with "sh_link", while one can argue that "associated" means an undirected edge. Also, combine tests and add precise SMLoc to improve diagnostics. Reviewed By: eugenis, grimar, jhenderson Differential Revision: https://reviews.llvm.org/D74082	2020-02-06 11:31:04 -08:00
Jeremy Morse	6531a78ac4	Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field" This reverts commit `ed29dbaafa`. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in `e39e2b4a79` that's technically unrelated. dbg-addr-dse.ll conflicted because `41206b61e3` (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ece761427f	Revert "[DebugInfo][DAG] Distinguish different kinds of location indirection" This reverts commit `3137fe4d23`. I'm backing out D68945, which this patch is a follow up for. It'll be re-landed when D68945 is fixed. The changes to dbg-value-func-arg.ll occur because our handling of certain kinds of location now mixes up indirection that happens at different points in a DIExpression. While this is a regression, it's a return to the prior behaviour while a better patch is sought.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ed5998d21e	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `2d3174c4df`. The overall solution for this problem is reverting D68945, which wasn't handling the -O0 path through the codegen backend correctly. See: discussion in D73526.	2020-02-06 14:41:39 +00:00
Sjoerd Meijer	93b0536fd2	[RDA] getInstFromId: find instructions. NFC. To find the instruction in the block for a given ID, first a count and then a lookup was performed in the map, which is almost the same thing, thus doing double the work. Differential Revision: https://reviews.llvm.org/D73866	2020-02-06 14:13:31 +00:00
Sam Parker	0a8cae10fe	[ReachingDefs] Make isSafeToMove more strict. Test that we're not moving the instruction through instructions with side-effects. Differential Revision: https://reviews.llvm.org/D74058	2020-02-06 14:06:08 +00:00
Diogo Sampaio	8ba2b62810	[ARM] Fix non-deterministic behaviour Summary: ARM Type Promotion pass does not clear the container that defines if one variable was visited or not, missing optimization opportunities by luck when two llvm::Values from different functions are allocated at the same memory address. Also fixes a comment and uses existing method to pop and obtain last element of the worklist. Reviewers: samparker Reviewed By: samparker Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73970	2020-02-06 09:21:13 +00:00
Matt Arsenault	7464e8d6ad	GlobalISel: Remove check for illegal MIR The verifier will catch this.	2020-02-05 18:37:17 -05:00
Jonas Paulsson	96ea377ea4	[PHIElimination] Compile time optimization for huge functions. This is a compile-time optimization for PHIElimination (splitting of critical edges), which was reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As discussed there, the way to remedy the slowdowns with huge functions is to pre-compute the live-in registers for each MBB in an efficient way in PHIElimination.cpp and then pass that information along to LiveVariables::addNewBlock(). In all the huge test programs where this slowdown has been noticeable, it has disappeared entirely with this patch. Review: Björn Pettersson, Quentin Colombet. Differential Revision: https://reviews.llvm.org/D73152	2020-02-05 18:10:03 -05:00
Matt Arsenault	9087ef0765	GlobalISel: Allow CSE of G_IMPLICIT_DEF The legalizer produces a lot of these, and they make reading legalized MIR annoying. For some reason, this does seem to sometimes introduce copies of implicit def, which is dumb.	2020-02-05 17:47:21 -05:00
Shu-Chun Weng	ce9633633c	[GlobalISel][AArch64] Fix contract cross-bank copies with SIMD instructions contractCrossBankCopyIntoStore() finds the instruction that defines the source register and uses its output to replace the register. There are, however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES. The current implementation hardcodes operand 0 and has no way of knowing which output should be used. This change adds another function to directly return the register that is the source of the register and uses that for folding. This fixes https://bugs.llvm.org/show_bug.cgi?id=44783 Differential Revision: https://reviews.llvm.org/D74005	2020-02-05 10:38:35 -08:00
Simon Pilgrim	4592bb7195	visitINSERT_VECTOR_ELT - pull out repeated dyn_cast. NFCI. This always gets called at least once.	2020-02-05 13:30:54 +00:00
Djordje Todorovic	de90d73e03	[DebugInfo] Avoid the call site param for mem instrs with multiple defs We currently only handle mem instructions with a single define. Avoid the call site parameter debug info when we find the case with multiple defs, rather than throwing an assert. Differential Revision: https://reviews.llvm.org/D73954	2020-02-05 10:03:14 +01:00
Thomas Lively	649aba93a2	Revert "[WebAssembly][InstrEmitter] Foundation for multivalue call lowering" Summary: This reverts commit `3ef169e586`. The purpose of this commit was to allow stack machines to perform instruction selection for instructions with variadic defs. However, MachineInstrs fundamentally cannot support variadic defs right now, so this change does not turn out to be useful. Depends on D73927. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73928	2020-02-04 20:04:59 -08:00
David Blaikie	ec50e10db4	DebugInfo: Hash DW_OP_convert in loclists when using Split DWARF Originally committed in: `1ced28cbe7` Reverted in: `f75301d16d` (reverted due to tests failing on non-linux/x86 targets, tests have since been generalized and specialized... since Split DWARF isn't supported on non-elf targets anyway and we have no way to run on "whatever elf target is available" so they fail on MacOS without an explicit target triple) This code was incorrectly emitting extra bytes into arbitrary parts of the object file when it was meant to be hashing them to compute the DWO ID. Follow-up patch(es) will refactor this API somewhat to make such bugs harder to introduce, hopefully.	2020-02-04 19:25:47 -08:00
Francis Visoiu Mistrih	7531a5039f	[Remarks] Extend the RemarkStreamer to support other emitters This extends the RemarkStreamer to allow for other emitters (e.g. frontends, SIL, etc.) to emit remarks through a common interface. See changes in llvm/docs/Remarks.rst for motivation and design choices. Differential Revision: https://reviews.llvm.org/D73676	2020-02-04 17:16:02 -08:00
Reid Kleckner	2d89e0a098	[SEH] Remove CATCHPAD SDNode and X86::EH_RESTORE MachineInstr The CATCHPAD node mostly existed to be selected into the EH_RESTORE instruction, which sets the frame back up when 32-bit Windows exceptions return to the parent function. However, creating this MachineInstr early increases the risk that other passes will come along and insert instructions that use the stack before ESP and EBP are restored. That happened in PR44697. Instead of representing these in the instruction stream early, delay it until PEI. Mark the blocks where this needs to happen as EHPads, but not funclet entry blocks. Passes after PEI have to be careful not to hoist instructions that can use stack across frame setup instructions, so this should be relatively reliable. Fixes PR44697 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D73752	2020-02-04 15:13:12 -08:00
Matt Arsenault	23b76096b7	CodeGenPrepare: Reorder check for cold and shouldOptimizeForSize shouldOptimizeForSize is showing up in a profile, spending around 10% of the pass time in one function. This should probably not be so slow, but the much cheaper attribute check should be done first anyway.	2020-02-04 11:23:13 -08:00
Matt Arsenault	de8451fe4d	GlobalISel: Fold SmallVector resizes into constructors	2020-02-04 10:28:08 -08:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Nico Weber	f75301d16d	Revert "DebugInfo: Check DW_OP_convert in loclists with Split DWARF" and follow-ups. This reverts commit `1ced28cbe7`. This reverts commit `4f281f0474`. This reverts commit `552a8fe12b`. The test fails on non-Linux.	2020-02-04 10:06:46 -05:00
Jeremy Morse	41206b61e3	[DebugInfo] Re-instate LiveDebugVariables scope trimming This patch reverts part of r362750 / D62650, which stopped LiveDebugVariables from trimming leading variable location ranges down to only covering those instructions that are in scope. I've observed some circumstances where the number of DBG_VALUEs in a function can be amplified in an unnecessary way, to cover more instructions that are out of scope, leading to very slow compile times. Trimming the range of instructions that the variables cover solves the slow compile times. The specific problem that r362750 tries to fix is addressed by the assignment to RStart that I've added. Any variable location that begins at the first instruction of a block will now be considered to begin at the start of the block. While these sound the same, they have different SlotIndexes, and the register allocator may shoehorn additional instructions in between the two. The test added in the past (wrong_debug_loc_after_regalloc.ll) still works with this modification. live-debug-variables.ll has a range trimmed to not cover the prologue of the function, while dbg-addr-dse.ll has a DBG_VALUE sink past one instruction with no DebugLoc, which is expected behaviour. Differential Revision: https://reviews.llvm.org/D73691	2020-02-04 14:51:06 +00:00
Filipe Cabecinhas	abada5036e	[NFC] Fix some spelling mistakes to test pushing to GH.	2020-02-04 11:07:31 +00:00
Simon Pilgrim	3dd688a9ee	[DAG] OptLevelChanger - fix uninitialized variable analyzer warning (PR44471) Ensure that OptLevelChanger::SavedFastISel is initialized in the constructor. This should be NFC - as the equivalent 'same opt level' early-out is used in the destructor as well, so SavedFastISel is only actually referenced in the general case. Differential Revision: https://reviews.llvm.org/D73875	2020-02-04 10:54:33 +00:00
David Green	362d00e051	[ARM][VecReduce] Force expand vector_reduce_fmin Under MVE, we do not have any lowering for fminimum, which a vector_reduce_fmin without NoNan will be expanded into. As with the other recent patches, force this to expand in the pre-isel pass. Note that Neon lowering would be OK because the scalar fminimum uses the vector VMIN instruction, but is probably better to just rely on the scalar operations, which is what is done here. Also fixes what appears to be the reversal of INF vs -INF in the vector_reduce_fmin widening code.	2020-02-04 09:36:59 +00:00
Guillaume Chatelet	b8144c0536	[NFC] Encapsulate MemOp logic Summary: This patch simply introduces functions instead of directly accessing the fields. This helps introducing additional check logic. A second patch will add simplifying functions. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73945	2020-02-04 10:36:26 +01:00
David Blaikie	1ced28cbe7	DebugInfo: Hash DW_OP_convert in loclists when using Split DWARF This code was incorrectly emitting extra bytes into arbitrary parts of the object file when it was meant to be hashing them to compute the DWO ID. Follow-up patch(es) will refactor this API somewhat to make such bugs harder to introduce, hopefully.	2020-02-03 19:16:42 -08:00
David Blaikie	031f83fb82	DebugInfo: Simplify emitDebugLocEntry by never passing a null CU	2020-02-03 18:47:14 -08:00
Matt Arsenault	cd7650c186	GlobalISel: Implement fewerElementsVector for G_SEXT_INREG Start using a new strategy with a combination of merge and unmerges. This allows scalarizing before lowering, which in cases like <2 x s128> avoids producing giant illegal shifts.	2020-02-03 11:47:33 -08:00
Quentin Colombet	f26ff8c9df	[TargetRegisterInfo] Make the heuristic to skip region split overridable by the target RegAllocGreedy uses a fairly compile time intensive splitting heuristic called region splitting. This heuristic was disabled via another heuristic when it is likely that it won't be worth the compile time. The only way to control this other heuristic was via a command line option (huge-size-for-split). This commit gives more control on this heuristic by making it overridable by the target using a target hook in TargetRegisterInfo called shouldRegionSplitForVirtReg. The default implementation of this hook keeps the heuristic as it was before this patch.	2020-02-03 11:30:35 -08:00
Simon Pilgrim	61621f826a	[TargetLowering] SimplifyDemandedBits - add basic KnownBits ZEXTLoad handling We have to be careful in SimplifyDemandedBits with loads in case we attempt to combine back to a constant (which then gets turned into a constant pool load again), but we can at least set the upper KnownBits for a ZEXTLoad to zero.	2020-02-03 16:50:04 +00:00
Guillaume Chatelet	333f2ad8b8	[Alignment][NFC] Use Align for getMemcpy/Memmove/Memset Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73885	2020-02-03 17:13:19 +01:00
Guillaume Chatelet	fc19465965	[Alignment][NFC] Use Align for code creating MemOp Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73874	2020-02-03 14:10:30 +01:00
Guillaume Chatelet	75d9994a51	Fix broken invariant Summary: A Copy with a source that is zeros is the same as a Set of zeros. This fixes the invariant that SrcAlign should always be non-null. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73791	2020-02-03 11:01:05 +01:00
Fangrui Song	44cdae68c3	[CodeGenPrepare] Delete dead !DL check Follow-up for D73754 DL is assigned in CodeGenPrepare::runOnFunction and is guaranteed to be non-null.	2020-02-02 09:49:06 -08:00
Fangrui Song	5a56a25b0b	[CodeGenPrepare] Make TargetPassConfig required The code paths in the absence of TargetMachine, TargetLowering or TargetRegisterInfo are poorly tested. As rL285987 said, requiring TargetPassConfig allows us to delete many (untested) checks littered everywhere. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73754	2020-02-02 09:28:45 -08:00
Fangrui Song	5932f7b8f2	[PatchableFunction] Use an empty DebugLoc The current FirstMI.getDebugLoc() is actually null in almost all cases. If it isn't, the generated .loc will be considered initial. The .loc will have the prologue_end flag and terminate the prologue prematurely. Also use an overload of BuildMI that will not prepend PATCHABLE_FUNCTION_ENTRY to a MachineInstr bundle.	2020-02-01 14:12:06 -08:00
Craig Topper	943b5561d6	[LegalizeTypes][X86] Add a new strategy for type legalizing f16 type that softens it to i16, but promotes to f32 around arithmetic ops. This is based on this llvm-dev thread http://lists.llvm.org/pipermail/llvm-dev/2019-December/137521.html The current strategy for f16 is to promote type to float everywhere except where the specific width is required like loads, stores, and bitcasts. This results in rounding occurring in odd places instead of immediately after arithmetic operations. This interacts in weird ways with the __fp16 type in clang which is a storage only type where arithmetic is always promoted to float. InstCombine can remove some fpext/fptruncs around such arithmetic and turn it into arithmetic on half. This wouldn't be so bad if SelectionDAG was able to put those fpext/fpround back in when it promotes. It is also not obvious how to make the existing strategy work with STRICT fp. We need to use STRICT versions of the conversions which require chain operands. But if the conversions are created for a bitcast, there is no place to get an appropriate chain from. This patch implements a different strategy where conversions are emitted directly around arithmetic operations. And otherwise it's passed around as an i16 including in arguments and return values. This can result in more conversions between arithmetic operations, but is closer to matching the IR the frontend generates for __fp16. And it will allow us to use the chain from constrained arithmetic nodes to link the STRICT_FP_TO_FP16/STRICT_FP16_TO_FP that will need to be added. I've set it up so that each target can opt into the new behavior. Converting all the targets myself was more than I was able to handle. Differential Revision: https://reviews.llvm.org/D73749	2020-02-01 11:21:04 -08:00
Matt Arsenault	bc101ffd77	GlobalISel: Support widening unmerge results with pointer source	2020-02-01 10:47:03 -05:00
David Blaikie	338beff4dc	DwarfDebug.cpp: Fix some indentation	2020-01-31 16:01:57 -08:00
David Blaikie	b33e5f3c3e	DebugInfo: Split DWARF: Hash non-member function child DIEs Significant missing hashing - as per the comment this was only meant to skip member functions (unspecified, but I think it's legible as member function declarations, not definitions) but was skipping all named subprograms (so only hashed child DIEs for member function definitions - because they didn't have a direct name, but only a name given indirectly in the DW_AT_specification-referenced DIE)	2020-01-31 15:32:03 -08:00
Matt Arsenault	792d9b5719	DAG: Check if a value is divergent before requiresUniformRegister This avoids a potentially expensive scan if we already know it doesn't matter.	2020-01-31 15:27:18 -08:00
Jay Foad	f465b1aff4	[GlobalISel] Tweak lowering of G_SMULO/G_UMULO Summary: Applying this cleanup: - MIRBuilder.buildInstr(TargetOpcode::G_ASHR) - .addDef(Shifted) - .addUse(Res) - .addUse(ShiftAmt); + MIRBuilder.buildAShr(Shifted, Res, ShiftAmt); caused an assertion failure here: llc: /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:404: llvm::MachineInstr *llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() \|\| std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition"' failed. #4 0x00000000050a6d96 in llvm::MachineRegisterInfo::getVRegDef (this=0x74606a0, Reg=2147483650) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:403 #5 0x00000000066148f6 in llvm::getConstantVRegValWithLookThrough (VReg=2147483650, MRI=..., LookThroughInstrs=false, HandleFConstant=true) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:244 #6 0x00000000066147da in llvm::getConstantVRegVal (VReg=2147483650, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:210 #7 0x0000000006615367 in llvm::ConstantFoldBinOp (Opcode=101, Op1=2147483650, Op2=2147483656, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:341 #8 0x000000000657eee0 in llvm::CSEMIRBuilder::buildInstr (this=0x7465010, Opc=101, DstOps=..., SrcOps=..., Flag=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp:160 #9 0x0000000003645958 in llvm::MachineIRBuilder::buildAShr (this=0x7465010, Dst=..., Src0=..., Src1=..., Flags=...) at /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1298 #10 0x00000000065c35b1 in llvm::LegalizerHelper::lower (this=0x7fffffffb5f8, MI=..., TypeIdx=0, Ty=...) 
at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2020 because at this point there are two instructions defining Res: the original G_SMULO/G_UMULO and the new G_MUL that we built. The fix is to modify the original mul in place, so that there is only ever one definition of Res. Reviewers: arsenm, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72842	2020-01-31 19:21:01 +00:00
Simon Pilgrim	8fbc7fd567	[DAG] SimplifyMultipleUseDemandedBits - peek through unused ISD::INSERT_SUBVECTOR subvectors If we don't demand any elements of the inserted subvector then just skip it.	2020-01-31 18:57:22 +00:00
Simon Pilgrim	5702dadf6f	[DAG] Enable ISD::INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::INSERT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-31 18:02:34 +00:00
Hiroshi Yamauchi	ac8da31a0f	[PGO][PGSO] Handle MBFIWrapper Some code gen passes use MBFIWrapper to keep track of the frequency of new blocks. This was not taken into account and could lead to incorrect frequencies as MBFI silently returns zero frequency for unknown/new blocks. Add a variant for MBFIWrapper in the PGSO query interface. Depends on D73494.	2020-01-31 09:36:55 -08:00
Jay Foad	2a1b5af299	[GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister Summary: As a side effect some redundant copies of constant values are removed by CSEMIRBuilder. Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73789	2020-01-31 17:07:16 +00:00
Guillaume Chatelet	3c89b75f23	[NFC] Introduce a type to model memory operation Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785	2020-01-31 17:29:01 +01:00
Quentin Colombet	cfebd77742	[GISel][KnownBits] Fix a bug where we could run out of stack space One of the exit criteria of computeKnownBits is whether we reach the max recursive call depth. Before this patch we would check that the depth is exactly equal to max depth to exit. Depth may get bigger than max depth if it gets passed to a different GISelKnownBits object. This may happen when say a generic part uses a GISelKnownBits object with some max depth, but then we hit TL.computeKnownBitsForTargetInstr which creates a new GISelKnownBits object with a different and smaller depth. In that situation, when we hit the max depth check for the first time in the target specific GISelKnownBits object, depth may already be bigger than the current max depth. Hence we would continue to compute the known bits, until we ran through the full depth of the chain of computation or ran out of stack space. For instance, let's say we have GISelKnownBits Info(/*MaxDepth*/ = 10); Info.getKnownBits(Foo) // 9 recursive calls to computeKnownBitsImpl. // Then we hit a target specific instruction. // The target specific GISelKnownBits does this: GISelKnownBits TargetSpecificInfo(/*MaxDepth*/ = 6) TargetSpecificInfo.computeKnownBitsImpl() // <-- next max depth checks would // always return false. This commit does not have any test case, as none of the in-tree targets use computeKnownBitsForTargetInstr.	2020-01-30 19:30:39 -08:00
Leonard Chan	2d3174c4df	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. This is an attempt to reland with a few fixes for buildbot since I haven't merged from master in a bit. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 17:09:42 -08:00
Amara Emerson	84bd851108	[GlobalISel][IRTranslator] When translating vector geps, splat the base pointer if required. We can have geps that have a scalar base pointer, and a vector index value, which means that the base pointer must be splatted into a vector of pointers. This fixes crashes on arm64 GlobalISel with optimizations enabled.	2020-01-30 16:27:27 -08:00
Leonard Chan	3b23453b6c	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `fff6a1b0f1`. This was breaking a bunch of buildbots.	2020-01-30 16:18:41 -08:00
Leonard Chan	fff6a1b0f1	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 15:58:37 -08:00
Matt Arsenault	eb7f74e300	CodeGen: Use Register	2020-01-30 15:01:56 -08:00
Sean Fertile	8b737688c2	[AIX] Minor cleanup in AsmPrinter. [NFC] - Extends the comments related to function descriptors, noting how they are only used on AIX. - Changes the condition used to gate the creation of the current function symbol in AsmPrinter::SetupMachineFunction to reflect being AIX specific. The creation of the symbol is different because of AIX's linkage conventions, not because AIX uses function descriptors. Differential Revision: https://reviews.llvm.org/D73115	2020-01-30 14:15:02 -05:00
Fangrui Song	06b8e32d4f	[AArch64] -fpatchable-function-entry=N,0: place patch label after BTI Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after `9a24488cb6`, we place the NOP sled after the initial BTI. ``` .Lfunc_begin0: bti c nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lfunc_begin0 ``` This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label: ``` .Lfunc_begin0: bti c .Lpatch0: nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lpatch0 ``` This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680	2020-01-30 11:11:52 -08:00
Matt Arsenault	ea956685a1	GlobalISel: Implement s32->s64 G_FPTOSI lowering Port directly from DAG version. The lowering for G_FPTOUI used to fail on AMDGPU because it uses G_FPTOSI.	2020-01-30 08:47:07 -05:00
Dominik Montada	dc141af755	[GlobalISel] (fix) Use pointer type size for offset constant when lowering stores Commit `9965b12fd1` was supposed to change the offset constant when lowering load/stores, but only introduced this change for loads. This patch adds the same fix for stores.	2020-01-30 08:32:35 -05:00
Simon Pilgrim	57b0d33224	[DAGCombiner] ISD::AND/OR/XOR - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:53 +00:00
Simon Pilgrim	a967aa2706	[DAGCombiner] ISD::SDIV/UDIV/SREM/UREM - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:52 +00:00
Matt Arsenault	c5fffa4da3	GlobalISel: Add observer argument to legalizeIntrinsic This is passed to legalizeCustom, but not intrinsic. Also remove the MRI argument, since you can get that from the MachineIRBuilder. I'm not sure why MachineIRBuilder has a private observer member, and this is passed separately.	2020-01-29 18:33:45 -05:00
Amara Emerson	c12f046eb9	[GlobalISel] Add new combine to convert scalar G_MUL to G_SHL. For pow2 constants we should use G_SHL for pattern matching (and perf) purposes later. Vector support not yet implemented. Differential Revision: https://reviews.llvm.org/D73659	2020-01-29 13:39:00 -08:00
Amara Emerson	0da937bb5c	[GlobalISel][IRTranslator] Follow convention and put constant offset of getelementptr arithmetic on RHS. We were needlessly putting known constant values on the LHS of a G_MUL, which is suboptimal. Differential Revision: https://reviews.llvm.org/D73650	2020-01-29 11:37:19 -08:00
Fangrui Song	8903e61b66	[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects For `MC_GlobalAddress` operands referencing certain GlobalObjects, we can lower them to STB_LOCAL aliases to avoid costs brought by assembler/linker's conservative decisions about symbol interposition: * An assembler conservatively assumes a global default visibility symbol interposable (ELF semantics). So relocations in object files are needed even if the code generator assumed the definition exact and non-interposable. * The relocations can cause the creation of PLT entries on some targets for -shared links. A linker conservatively assumes a global default visibility symbol interposable (if not otherwise constrained by -Bsymbolic/--dynamic-list/VER_NDX_LOCAL/etc). "certain" refers to GlobalObjects in the intersection of `hasExactDefinition() and !isInterposable()`: `external`, `appending`, `internal`, `private`. Local linkages (`internal` and `private`) cannot be interposed. `appending` is for very few objects LLVM interprets specially. So the set just includes `external`. This patch emits STB_LOCAL aliases (.Lfoo$local) for such GlobalObjects, so that targets can lower MC_GlobalAddress operands to STB_LOCAL aliases if applicable. We may extend the scope and include GlobalAlias in the future. LLVM's existing -fno-semantic-interposition behaviors give us license to do such optimizations: * Various optimizations (ipconstprop, inliner, sccp, sroa, etc) treat normal ExternalLinkage GlobalObjects as non-interposable. * Before D72197, MC resolved a PC-relative VK_None fixup to a non-local symbol at assembly time (no outstanding relocation), if the target is defined in the same section. Put simply, even if IR optimizations failed to optimize and allowed interposition for the function call in `void foo() {} void bar() { foo(); }`, the assembler would disallow it. This patch sets up AsmPrinter infrastructure to make -fno-semantic-interposition more so. 
With and without the patch, the object file output should be identical: `.Lfoo$local` does not take a symbol table entry. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D73228	2020-01-29 10:58:43 -08:00
Simon Pilgrim	f7245ef897	[DAGCombiner] ISD::SHL/SRA/SRL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 18:49:42 +00:00
Adrian Prantl	18dbe1b279	Run clang-format on DwarfExpression (NFC)	2020-01-29 10:23:12 -08:00
Adrian Prantl	816ee8a423	DwarfExpression: Factor out getOrCreateBaseType() (NFC)	2020-01-29 10:23:12 -08:00
Simon Pilgrim	25b8e96388	[DAGCombiner] ISD::MUL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 17:26:22 +00:00
Simon Pilgrim	4b04e11735	[DAGCombiner] Sub/SUBSAT - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 16:57:13 +00:00
Simon Pilgrim	48bd6a0986	[DAGCombiner] visitIMINMAX - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us instead of the ConstantSDNode variant where we have to do it ourselves.	2020-01-29 16:57:13 +00:00
Matt Arsenault	b63629a58d	GlobalISel: Fix mask computation in lowerInsert This is supposed to be the high bit index, not the width. Use the wrapping form of getBitsSet and avoid the bitflip.	2020-01-29 08:25:36 -08:00
Jay Foad	0d7bd34312	[MachineScheduler] Ignore artificial edges when forming store chains Summary: BaseMemOpClusterMutation::apply forms store chains by looking for control (i.e. non-data) dependencies from one mem op to another. In the test case, clusterNeighboringMemOps successfully clusters the loads, and then adds artificial edges to the loads' successors as described in the comment: // Copy successor edges from SUa to SUb. Interleaving computation // dependent on SUa can prevent load combining due to register reuse. The effect of this is that data dependencies from one load to a store are copied as artificial dependencies from a different load to the same store. Then when BaseMemOpClusterMutation::apply looks at the stores, it finds that some of them have a control dependency on a previous load, which breaks the chains and means that the stores are not all considered part of the same chain and won't all be clustered. The fix is to only consider non-artificial control dependencies when forming chains. Subscribers: MatzeB, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71717	2020-01-29 16:23:01 +00:00
Matt Arsenault	f717483acd	GlobalISel: Assert on invalid bitcast in MIRBuilder The other casts validate, so this should too.	2020-01-29 07:49:39 -08:00
Matt Arsenault	c5c1bb3374	GlobalISel: Lower G_WRITE_REGISTER	2020-01-29 06:48:24 -08:00
David Stenberg	6a2413c435	[ARM64] Debug info for structure argument missing DW_AT_location Summary: Prevent eliminating dbg_val due to COPY. Fixes this https://bugs.llvm.org/show_bug.cgi?id=40709 Patch by: Kamlesh Kumar (kamleshbhalui) Reviewers: aprantl, dblaikie, vsk, dsanders Reviewed By: dsanders Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73159	2020-01-29 10:56:23 +01:00
Sam Parker	ac30ea2f87	[RDA][ARM] Move functionality into RDA Add several new helpers to RDA: - hasLocalDefBefore - isRegDefinedAfter - isSafeToDefRegAt And move two bits of logic from ARMLowOverheadLoops into RDA: - isSafeToMove - isSafeToRemove Both of these have some wrappers too to make them more convenient to use. Differential Revision: https://reviews.llvm.org/D73460	2020-01-29 03:27:47 -05:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Michael Spang	a2fb2c0ddc	[GlobalMerge] Preserve symbol visibility when merging globals Symbols created for merged external global variables have default visibility. This can break programs when compiling with -Oz -fvisibility=hidden as symbols that should be hidden will be exported at link time. Differential Revision: https://reviews.llvm.org/D73235	2020-01-28 13:26:18 -08:00
Hiroshi Yamauchi	2c03c899d5	[MBFI] Move BranchFolding::MBFIWrapper to its own files. NFC. Summary: To avoid header file circular dependency issues in passing updated MBFI (in MBFIWrapper) to the interface of profile guided size optimizations. A prep step for (and split off of) D73381. Reviewers: davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73494	2020-01-28 10:58:46 -08:00
Sam Parker	7ad879caa0	[NFC][RDA] typedef SmallPtrSetImpl<MachineInstr*>	2020-01-28 13:15:44 +00:00
Wang, Pengfei	3239b5034e	[FPEnv] Add pragma FP_CONTRACT support under strict FP. Summary: Support pragma FP_CONTRACT under strict FP. Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3 Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72820	2020-01-28 20:43:43 +08:00
Guillaume Chatelet	879c825cb8	[intrinsics] Add @llvm.memcpy.inline intrinsics Summary: This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++. A follow up CL will add the intrinsics in Clang. Reviewers: courbet, theraven, t.p.northover, jdoerfert, tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71710	2020-01-28 09:42:01 +01:00
Fangrui Song	c7c5da6df3	Reland "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" Reland `7a8b0b1595`, with a fix that checks `!E.value().empty()` to avoid inserting a zero to SlotRemap. Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D73510	2020-01-27 15:58:49 -08:00
Jay Foad	cbbbd5b5f6	[GlobalISel] Make use of KnownBits::computeForAddSub Summary: This is mostly NFC. computeForAddSub may give more precise results in some cases, but that doesn't seem to affect any existing GlobalISel tests. Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73431	2020-01-27 22:22:56 +00:00
Simon Pilgrim	e7e043724e	[DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-27 21:17:47 +00:00
Adrian Prantl	a095d149c2	Fix an assertion failure in DwarfExpression's subregister composition This patch fixes an assertion failure in DwarfExpression that is triggered when a complex fragment has exactly the size of a subregister of the register the DBG_VALUE points to and there is no DWARF encoding for the super-register. I took the opportunity to replace/document some magic values with static constructor functions to make this code less confusing to read. rdar://problem/58489125 Differential Revision: https://reviews.llvm.org/D72938	2020-01-27 12:44:37 -08:00
Vedant Kumar	e08f205f5c	Reland (again): [DWARF] Allow cross-CU references of subprogram definitions This is a revert-of-revert (i.e. this reverts commit `802bec89`, which itself reverted `fa4701e1` and `79daafc9`) with a fix folded in. The problem was that call site tags weren't emitted properly when LTO was enabled along with split-dwarf. This required a minor fix. I've added a reduced test case in test/DebugInfo/X86/fission-call-site.ll. Original commit message: This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. Update #1: Reland with a fix to create a declaration DIE when the declaration is missing from the CU's retainedTypes list. The declaration is left out of the retainedTypes list in two cases: 1) Re-compiling pre-r266445 bitcode (in which declarations weren't added to the retainedTypes list), and 2) Doing LTO function importing (which doesn't update the retainedTypes list). 
It's possible to handle (1) and (2) by modifying the retainedTypes list (in AutoUpgrade, or in the LTO importing logic resp.), but I don't see an advantage to doing it this way, as it would cause more DWARF to be emitted compared to creating the declaration DIEs lazily. Update #2: Fold in a fix for call site tag emission in the split-dwarf + LTO case. Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a ReleaseLTO-g build of the test suite. rdar://46577651, rdar://57855316, rdar://57840415, rdar://58888440 Differential Revision: https://reviews.llvm.org/D70350	2020-01-27 10:52:34 -08:00
Nico Weber	68051c1224	Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" This reverts commit `7a8b0b1595`. It seems to break exception handling on 32-bit Windows, see https://crbug.com/1045650	2020-01-27 11:22:33 -05:00
Dominik Montada	9965b12fd1	Use pointer type size for offset constant when lowering load/stores	2020-01-27 06:55:32 -08:00
Matt Arsenault	2a160ba5b0	GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results Only use shifts if the requested type exactly matches the source type, and create sub-unmerges otherwise.	2020-01-27 06:18:26 -08:00
Matt Arsenault	06d9230fef	GlobalISel: Translate vector GEPs	2020-01-27 05:35:05 -08:00
Igor Kudrin	8f3d47c54a	[DWARF] Do not pass Version to DWARFExpression. NFCI. The Version was used only to determine the size of an operand of DW_OP_call_ref. The size was 4 for all versions apart from 2, but the DW_OP_call_ref operation was introduced only in DWARF3. Thus, the code can be simplified and the use of Version eliminated. Differential Revision: https://reviews.llvm.org/D73264	2020-01-27 19:08:46 +07:00
David Stenberg	13d4ef9ac0	Improvements to call site register worklist Summary: This fixes PR44118. For cases where we have a chain like this: R8 = R1 (entry value) R0 = R8 call @foo R0 the code that emits call site entries using entry values would not follow that chain, instead emitting a call site entry with R8 as location rather than R0. Such a case was discovered when originally adding dbgcall-site-orr-moves.mir. This patch fixes that issue. This is done by changing the ForwardedRegWorklist set to a map in which the worklist registers always map to the parameter registers that they describe. Another thing this patch fixes is that worklist registers now can describe more than one parameter register at a time. Such a case occurred in dbgcall-site-interpretation.mir, resulting in a call site entry not being emitted for one of the parameters. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73168	2020-01-27 12:41:42 +01:00
David Stenberg	b46baa82fc	Don't separate imp/expl def handling for call site params Summary: Since D70431 the describeLoadedValue() hook takes a parameter register, meaning that it can now be asked to describe any register. This means that we can drop the difference between explicit and implicit defines that we previously had in collectCallSiteParameters(). I have not found any case for any upstream targets where a parameter register is only implicitly defined, and does not overlap with any explicit defines. I don't know if such a case would even make sense. So as far as I have tested, this patch should be a non-functional change. However, this reduces the complexity of the code a bit, and it will simplify the implementation of an upcoming patch which solves PR44118. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73167	2020-01-27 11:31:09 +01:00
Petar Avramovic	cbf03aee6d	[MIPS GlobalISel] Select population count (popcount) G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates these intrinsics from __builtin_popcount and __builtin_popcountll. Add lower and narrow scalar for G_CTPOP. Lower G_CTPOP for MIPS32. Differential Revision: https://reviews.llvm.org/D73216	2020-01-27 09:59:50 +01:00
Petar Avramovic	8bc7ba5b9e	[MIPS GlobalISel] Select count trailing zeros The llvm.cttz.<type> intrinsic has an additional i1 argument, is_zero_undef, which tells whether zero as the first argument produces a defined result. G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false) intrinsics; clang generates these intrinsics from __builtin_ctz and __builtin_ctzll. G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true). Clang generates such intrinsics as part of the expansion of __builtin_ffs and __builtin_ffsll. It is also traditionally part of many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ. Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32. Differential Revision: https://reviews.llvm.org/D73215	2020-01-27 09:51:06 +01:00
Petar Avramovic	2b66d32f3f	[MIPS GlobalISel] Select count leading zeros The llvm.ctlz.<type> intrinsic has an additional i1 argument, is_zero_undef, which tells whether zero as the first argument produces a defined result. The MIPS clz instruction returns 32 for zero input. G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false) intrinsics; clang generates these intrinsics from __builtin_clz and __builtin_clzll. G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as second argument. It is also traditionally part of many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF). Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32. Differential Revision: https://reviews.llvm.org/D73214	2020-01-27 09:43:38 +01:00
Fangrui Song	941f20c3bd	[MachineVerifier] Simplify and delete LLVM_VERIFY_MACHINEINSTRS from a comment. NFC The environment variable has been unused since r228079.	2020-01-27 00:31:23 -08:00
Wang, Pengfei	17b8f96d65	[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI. Some functions like fmuladd don't really have a node, so we should separate their declarations from those that have a node to avoid introducing fake nodes. Differential Revision: https://reviews.llvm.org/D72871	2020-01-27 10:38:05 +08:00
Simon Pilgrim	4a5f9d9faf	[TargetLowering] Respect recursive depth in SimplifyDemandedBits call to ComputeNumSignBits	2020-01-26 10:01:56 +00:00
Simon Pilgrim	3daa71ee00	[SelectionDAG] ComputeNumSignBits - add DemandedElts support for MIN/MAX ops	2020-01-25 20:21:14 +00:00
Simon Pilgrim	3f8916b2e8	[SelectionDAG] ComputeNumSignBits - add support for rotate non-uniform vector amounts	2020-01-25 19:15:05 +00:00
Simon Pilgrim	e3c26a9d1b	[SelectionDAG] ComputeNumSignBits - add support for rotate uniform vector amounts	2020-01-25 18:55:47 +00:00
Simon Pilgrim	c8de7c8f50	[TargetLowering] SimplifyDemandedBits - Remove ashr if all our demandedbits already match the sign bit Differential Revision: https://reviews.llvm.org/D73412	2020-01-25 17:36:46 +00:00
Vedant Kumar	802bec8961	Revert "Reland: [DWARF] Allow cross-CU references of subprogram definitions" ... as well as: Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info" This reverts commit `fa4701e197`. This reverts commit `79daafc903`. There have been reports of this assert getting hit: CalleeDIE && "Could not find DIE for call site entry origin"	2020-01-24 18:07:54 -08:00
Quentin Colombet	5d87b5d202	[GISelKnownBits] Add support for PHIs Teach the GISelKnownBits analysis how to deal with PHI operations. PHIs are essentially COPYs happening on edges, so we can just reuse the code for COPY. This is NFC COPY-wise, as we leave Depth untouched when calling computeKnownBitsImpl for COPYs, like it was before this patch. Increasing Depth is, however, required for PHIs, as they may loop back to themselves and we would end up in an infinite loop if we were not increasing Depth. Differential Revision: https://reviews.llvm.org/D73317	2020-01-24 16:43:52 -08:00
@justice_adams (Justice Adams)	daee63f974	[SelectionDag] Updated FoldConstantArithmetic method signature in preparation for merge with FoldConstantVectorArithmetic Updated FoldConstantArithmetic method signature to match that of FoldConstantVectorArithmetic in preparation for merging the two functions together https://bugs.llvm.org/show_bug.cgi?id=36544 This is the first step in combining the various FoldConstantArithmetic and FoldConstantVectorArithmetic functions into one FoldConstantArithmetic function. Differential Revision: https://reviews.llvm.org/D72870	2020-01-24 18:00:58 -05:00
Craig Topper	d3bf06bc81	[DAGCombiner] Add combine for (not (strict_fsetcc)) to create a strict_fsetcc with the opposite condition. Unlike the existing code that I modified here, I only handle the case where the strict_fsetcc has a single use. Not sure exactly how to handle multiple uses. Testing this on X86 is hard because we already have other combines that get rid of the lowered version of the integer setcc that this xor will eventually become. So this combine really just saves a bunch of extra nodes being created. Not sure about other targets. Differential Revision: https://reviews.llvm.org/D71816	2020-01-24 14:15:36 -08:00
Stanislav Mekhanoshin	be8e38cbd9	Correct NumLoads in clustering Scheduler sends NumLoads argument into shouldClusterMemOps() one less than the actual cluster length. So for 2 instructions it will pass just 1. Correct this number. This is NFC for in-tree targets. Differential Revision: https://reviews.llvm.org/D73292	2020-01-24 12:45:28 -08:00
Stanislav Mekhanoshin	7a94d4f4ee	Allow combining of extract_subvector to extract element Differential Revision: https://reviews.llvm.org/D73132	2020-01-24 10:50:26 -08:00
Fangrui Song	50a3ff30e1	[PatchableFunction] Allow empty entry MachineBasicBlock Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73301	2020-01-24 09:42:48 -08:00
Tom Weaver	f5147765ba	[DebugInfo][LiveDebugValues] Teach Live Debug Values About Meta Instructions Previously LiveDebugValues pass would consider meta instructions that 'fiddle' with liveness of registers as register definitions when transferring register defs. This would mean that, for example, a KILL instruction would cause LiveDebugValues to terminate the range of an earlier DBG_VALUE instruction resulting in the non-propagation of said DBG_VALUE instructions into later blocks. This patch adds the check and a helpful comment, fixes a test that previously tested for the broken behaviour by coincidence and adds a test specifically for this. reviewers: vsk, dstenb, djtodoro Differential Revision: https://reviews.llvm.org/D73210	2020-01-24 16:29:05 +00:00
Guillaume Chatelet	805c157e8a	[Alignment][NFC] Deprecate Align::None() Summary: This is a follow up on https://reviews.llvm.org/D71473#inline-647262. There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case. Reviewers: xbolva00, courbet, bollu Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73099	2020-01-24 12:53:58 +01:00
Simon Pilgrim	0b45c2264a	[SelectionDAG] rot(x, y) --> x iff ComputeNumSignBits(x) == BitWidth(x) Rotating an 0/-1 value by any amount will always result in the same 0/-1 value	2020-01-24 10:35:57 +00:00
Fangrui Song	22467e2595	Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0 Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label. -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like: ``` .type foo,@function .Ltmp0: # @foo nop foo: .Lfunc_begin0: # optional `bti c` (AArch64 Branch Target Identification) or # `endbr64` (Intel Indirect Branch Tracking) nop .section __patchable_function_entries,"awo",@progbits,get,unique,0 .p2align 3 .quad .Ltmp0 ``` -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html): ``` (a) (b) func: func: .Ltmp0: bti c bti c .Ltmp0: nop nop ``` (a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp . Differential Revision: https://reviews.llvm.org/D73070	2020-01-23 17:02:27 -08:00
Simon Pilgrim	e25eee4db7	[SelectionDAG] ComputeNumSignBits - add ISD::ADD demanded elts support	2020-01-23 17:48:07 +00:00
Sam Parker	05532575e8	[RDA] Skip debug values Skip debug instructions when iterating through a block to find uses. Differential Revision: https://reviews.llvm.org/D73273	2020-01-23 17:04:54 +00:00
Simon Pilgrim	0fec8acdd8	[SelectionDAG] ComputeNumSignBits - add ISD::ADD vector support Add missing handling for (ADD (AND X, 1), -1) uniform vectors	2020-01-23 16:42:12 +00:00
Guillaume Chatelet	59f95222d4	[Alignment][NFC] Use Align with CreateAlignedStore Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73274	2020-01-23 17:34:32 +01:00
Simon Pilgrim	fc5bbbf328	[SelectionDAG] ComputeNumSignBits - add ISD::SUB demanded elts support	2020-01-23 16:20:48 +00:00
Jay Foad	b482e1bfe2	[CodeGen] Make use of MachineInstrBuilder::getReg Reviewers: arsenm Subscribers: wdng, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73262	2020-01-23 13:38:13 +00:00
Sam Parker	0d1468db58	[NFC][RDA] Make the interface const Make all the public query methods const.	2020-01-23 13:32:11 +00:00
Simon Pilgrim	48d4ba8fb2	[SelectionDAG] Compute Known + Sign Bits - merge INSERT_VECTOR_ELT known/unknown index paths Match the approach in SimplifyDemandedBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits/ComputeNumSignBits call.	2020-01-23 13:31:37 +00:00
Guillaume Chatelet	279fa8e006	[Alignement][NFC] Deprecate untyped CreateAlignedLoad Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260	2020-01-23 13:34:32 +01:00
Simon Pilgrim	03cae086f4	[SelectionDAG] ComputeKnownBits - merge EXTRACT_VECTOR_ELT known/unknown index paths Match the approach in SimplifyDemandedBits/ComputeNumSignBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits call.	2020-01-23 11:29:16 +00:00
Simon Pilgrim	98da49d979	[SelectionDAG] Compute Known + Sign Bits - merge INSERT_SUBVECTOR known/unknown index paths Match the approach in SimplifyDemandedBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits/ComputeNumSignBits call, additionally we only ever need original demanded elts of the base vector even if the index is unknown.	2020-01-23 11:29:15 +00:00
Djordje Todorovic	91b0956f38	[NFC][DwarfDebug] Use proper analog GNU attribute for the pc address The low_pc is analogous to the DW_AT_call_return_pc, since it describes the return address after the call. The DW_AT_call_pc is the address of the call instruction, and we don't use it at the moment. Differential Revision: https://reviews.llvm.org/D73173	2020-01-23 12:15:35 +01:00
David Tenty	45a4aaea7f	[NFC][XCOFF] Refactor Csect creation into TargetLoweringObjectFile Summary: We create a number of standard types of control sections in multiple places for things like the function descriptors, external references and the TOC anchor among others, so it is possible for their properties to be defined inconsistently in different places. This refactor moves their creation and properties into functions in the TargetLoweringObjectFile class hierarchy, where functions for retrieving various special types of sections typically seem to reside. Note: There is one case in PPCISelLowering which is specific to function entry points which we don't address since we don't have access to the TLOF there. Reviewers: DiggerLin, jasonliu, hubert.reinterpretcast Reviewed By: jasonliu, hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72347	2020-01-22 12:09:11 -05:00
Stanislav Mekhanoshin	2d0fcf786c	Precommit NFC part of DAGCombiner change. NFC. This is NFC part of DAGCombiner::visitEXTRACT_SUBVECTOR() change in the D73132.	2020-01-22 09:01:22 -08:00
Hiroshi Yamauchi	ddbc728828	[PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. Summary: Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes a FDO perf regression in the TSVC benchmark from D71288. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73146	2020-01-22 08:36:54 -08:00
Sander de Smalen	4cf16efe49	[AArch64][SVE] Add patterns for unpredicated load/store to frame-indices. This patch also fixes up a number of cases in DAGCombine and SelectionDAGBuilder where the size of a scalable vector is used in a fixed-width context (thus triggering an assertion failure). Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D71215	2020-01-22 14:32:27 +00:00
Jay Foad	e0f0d0e55c	[MachineScheduler] Allow clustering mem ops with complex addresses The generic BaseMemOpClusterMutation calls into TargetInstrInfo to analyze the address of each load/store instruction, and again to decide whether two instructions should be clustered. Previously this had to represent each address as a single base operand plus a constant byte offset. This patch extends it to support any number of base operands. The old target hook getMemOperandWithOffset is now a convenience function for callers that are only prepared to handle a single base operand. It calls the new more general target hook getMemOperandsWithOffset. The only requirements for the base operands returned by getMemOperandsWithOffset are: - they can be sorted by MemOpInfo::Compare, such that clusterable ops get sorted next to each other, and - shouldClusterMemOps knows what they mean. One simple follow-on is to enable clustering of AMDGPU FLAT instructions with both vaddr and saddr (base register + offset register). I've left a FIXME in the code for this case. Differential Revision: https://reviews.llvm.org/D71655	2020-01-22 14:28:24 +00:00
Simon Pilgrim	80656fd7ae	[SelectionDAG] getShiftAmountConstant - assert the type is an integer.	2020-01-22 13:52:44 +00:00
Sander de Smalen	67d4c9924c	Add support for (expressing) vscale. In LLVM IR, vscale can be represented with an intrinsic. For some targets, this is equivalent to the constexpr: getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1 This can be used to propagate the value in CodeGenPrepare. In ISel we add a node that can be legalized to one or more instructions to materialize the runtime vector length. This patch also adds SVE CodeGen support for VSCALE, which maps this node to RDVL instructions (for scaled multiples of 16 bytes) or CNT[HSD] instructions (scaled multiples of 2, 4, or 8 bytes, respectively). Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner Reviewed by: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68203	2020-01-22 10:09:27 +00:00
Guillaume Chatelet	0957233320	[Alignment][NFC] Use Align with CreateMaskedStore Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73106	2020-01-22 11:04:39 +01:00
Amara Emerson	67a8775322	[AArch64] Don't generate gpr CSEL instructions in early-ifcvt if regclasses aren't compatible. In GlobalISel we may in some unfortunate circumstances generate PHIs with operands that are on separate banks. If-conversion doesn't currently check for that case and ends up generating a CSEL on AArch64 with incorrect register operands. Differential Revision: https://reviews.llvm.org/D72961	2020-01-21 16:51:31 -08:00
Quentin Colombet	ff1f3cc1a1	[GISelKnownBits] Make the max depth a parameter of the analysis Allow users of that analysis to define the cut off depth of the analysis instead of hardcoding 6. NFC as the default parameter is 6.	2020-01-21 11:35:31 -08:00
Thomas Lively	3ef169e586	[WebAssembly][InstrEmitter] Foundation for multivalue call lowering Summary: WebAssembly is unique among upstream targets in that it does not at any point use physical registers to store values. Instead, it uses virtual registers to model positions in its value stack. This means that some target-independent lowering activities that would use physical registers need to use virtual registers instead for WebAssembly and similar downstream targets. This CL generalizes the existing `usesPhysRegsForPEI` lowering hook to `usesPhysRegsForValues` in preparation for using it in more places. One such place is in InstrEmitter for instructions that have variadic defs. On register machines, it only makes sense for these defs to be physical registers, but for WebAssembly they must be virtual registers like any other values. This CL changes InstrEmitter to check the new target lowering hook to determine whether variadic defs should be physical or virtual registers. These changes are necessary to support a generalized CALL instruction for WebAssembly that is capable of returning an arbitrary number of arguments. Fully implementing that instruction will require additional changes that are described in comments here but left for a follow up commit. Reviewers: aheejin, dschuff, qcolombet Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71484	2020-01-21 11:13:46 -08:00
Fangrui Song	7a8b0b1595	[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager() Reviewed By: dantrushin Differential Revision: https://reviews.llvm.org/D73063	2020-01-21 09:46:27 -08:00
Krzysztof Parzyszek	020041d99b	Update spelling of {analyze,insert,remove}Branch in strings and comments These names have been changed from CamelCase to camelCase, but there were many places (comments mostly) that still used the old names. This change is NFC.	2020-01-21 10:15:38 -06:00
Simon Pilgrim	f04284cf1d	[TargetLowering] SimplifyDemandedBits ISD::SRA multi-use handling Call SimplifyMultipleUseDemandedBits to peek through extended source args with multiple uses	2020-01-21 15:12:07 +00:00
Simon Pilgrim	47f99d2ca8	[SelectionDAG] GetDemandedBits - remove ANY_EXTEND handling Rely on SimplifyMultipleUseDemandedBits fallback instead.	2020-01-21 14:39:00 +00:00
Simon Pilgrim	651fa669a2	[TargetLowering] SimplifyDemandedBits ANY_EXTEND/ANY_EXTEND_VECTOR_INREG multi-use handling Call SimplifyMultipleUseDemandedBits to peek through extended source args with multiple uses	2020-01-21 14:07:19 +00:00
Simon Pilgrim	5f5f478564	[DAG] Fold extract_vector_elt (scalar_to_vector), K to undef (K != 0) This was unconditionally folding this to the source operand, even if the access was out of bounds. Use undef instead if the extract is not the first element. This helps with some cases where 3-vectors are legalized and avoids processing the 4th component. Original Patch by: arsenm (Matt Arsenault) Differential Revision: https://reviews.llvm.org/D51589	2020-01-21 10:58:30 +00:00
Simon Pilgrim	8d2e6bdbe1	[TargetLowering] SimplifyDemandedBits - Pull out InDemandedMask variable to ISD::SHL. NFCI. Matches ISD::SRA + ISD::SRL variants.	2020-01-21 10:40:18 +00:00
Fangrui Song	d232c21566	[AsmPrinter] Don't emit __patchable_function_entries entry if "patchable-function-entry"="0" Add improve tests	2020-01-20 16:13:48 -08:00
Simon Pilgrim	9c06c10fba	[SelectionDAG] GetDemandedBits - fallback to SimplifyMultipleUseDemandedBits by default. First step towards removing SelectionDAG::GetDemandedBits entirely since it is so similar to SimplifyMultipleUseDemandedBits anyhow.	2020-01-20 16:51:52 +00:00
Awanish Pandey	84c4c87e04	Recommit "[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions." Summary: This was reverted in `328e0f3dca` due to chromium bot failure. This revision addresses that case. Original commit message: Summary: This patch will provide support for auto return type for the C++ member functions. Before this, the return type of the member function was deduced and stored in the DIE. This patch includes the llvm side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524	2020-01-20 15:13:13 +05:30
Fangrui Song	eaab1bf21e	[StackColoring] Remap FixedStackPseudoSourceValue frame index referenced by MachineMemOperand StackColoring::remapInstructions() remaps MachineOperand frame index (e.g. %stack.1 -> %stack.0) but does not remap FixedStackPseudoSourceValue frame index (e.g. store 4 into %stack.1.ap2.i.i) referenced by MachineMemOperand. This can cause an assertion failure when LiveDebugValues references a dead stack object. It is difficult to craft a test case. -g, va_copy and stack-coloring are required. I can only reproduce it on ppc32.	2020-01-19 22:53:45 -08:00
Fangrui Song	886d2c2ca7	[BranchRelaxation] Simplify offset computation and fix a bug in adjustBlockOffsets() If Start!=0, adjustBlockOffsets() may unnecessarily adjust the offset of Start. There is no correctness issue, but it can create more block splits.	2020-01-19 16:02:16 -08:00
Fangrui Song	9a24488cb6	[CodeGen] Move fentry-insert, xray-instrumentation and patchable-function before addPreEmitPass() The intention is to move patchable-function before aarch64-branch-targets (configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424). This also allows addPreEmitPass() passes to know the precise instruction sizes if they want. Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1. No output difference with this commit and the previous commit.	2020-01-19 00:09:46 -08:00
Fangrui Song	9583a3f262	[AsmPrinter] Delete dead takeDeletedSymbsForFunction() The code added in r98579 is dead now.	2020-01-18 17:08:00 -08:00
Michael Liao	6d0d86a64d	[DAG] Add helper for creating constant vector index with correct type. NFC.	2020-01-18 01:23:36 -05:00
David Blaikie	58b10df54f	DebugInfo: Move SectionLabel tracking into CU's addRange This makes the SectionLabel handling more resilient - specifically for future PROPELLER work which will have more CU ranges (rather than just one per function). Ultimately it might be nice to make this more general/resilient to arbitrary labels (rather than relying on the labels being created for CU ranges & then being reused by ranges, loclists, and possibly other addresses). It's possible that other (non-rnglist/loclist) uses of addresses will need the addresses to be in SectionLabels earlier (eg: move the CU.addRange to be done on function begin, rather than function end, so during function emission they are already populated for other use).	2020-01-17 18:12:34 -08:00
Derek Schuff	ff171acf84	[WebAssembly] Track frame registers through VReg and local allocation This change has 2 components: Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It describes how the Dwarf frame base will be encoded. That can be a register (the default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or a DW_OP_WASM_location descriptor. WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs has run. Make WebAssemblyExplicitLocals store the local it allocates for the frame register. Use this local information to implement getDwarfFrameBase. The result is that the DW_AT_frame_base attribute is correctly encoded for each subprogram, and each param and local variable has a correct DW_AT_location that uses DW_OP_fbreg to refer to the frame base. This is a reland of rG3a05c3969c18 with fixes for the expensive-checks and Windows builds. Differential Revision: https://reviews.llvm.org/D71681	2020-01-17 17:23:56 -08:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. 
The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attributes are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Reid Kleckner	423e3db6a8	Remove unneeded FoldingSet.h include from Attributes.h Avoids 637 extra FoldingSet.h and Allocator.h includes. FoldingSet.h needs Allocator.h, which is relatively expensive.	2020-01-17 16:36:09 -08:00
Evgenii Stepanov	d081962dea	Merge memtag instructions with adjacent stack slots. Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and adds _untied variants of those pseudos that are allowed to take the base address as a FI operand. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286	2020-01-17 15:19:29 -08:00
Ian Levesque	97ba483026	[xray] Allow instrumenting only function entry and/or only function exit Extend -fxray-instrumentation-bundle to split function-entry and function-exit into two separate options, so that it is possible to instrument only function entry or only function exit. For use cases that only care about one or the other this will save significant overhead and code size. Differential Revision: https://reviews.llvm.org/D72890	2020-01-17 13:32:34 -08:00

29326 Commits