llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	9236125ec8	GlobalISel: Preserve LLT when bitcasting loads and stores This also avoids improperly legalizing some truncating vector stores.	2021-07-19 11:30:14 -04:00
Roman Lebedev	5b51bd1878	[TLI] prepareSREMEqFold(): use correct VT for the final VSELECT (PR51133) We were using the wrong VT for this final VSELECT, it should be in the final comparison VT, not the source value's VT. Fixes https://bugs.llvm.org/show_bug.cgi?id=51133	2021-07-19 16:44:00 +03:00
Eli Friedman	6601be4419	[X86] Remove incorrect use of known bits in shuffle simplification. This reverts commit `2a419a0b99`. The result of a shufflevector must not propagate poison from any element other than the one noted in the shuffle mask. The regressions outside of fptoui-may-overflow.ll can probably be recovered some other way; for example, using isGuaranteedNotToBePoison. See discussion on https://reviews.llvm.org/D106053 for more background. Differential Revision: https://reviews.llvm.org/D106222	2021-07-18 18:13:11 -07:00
Simon Pilgrim	fd7a54c709	[DAG] DAGCombiner::foldSelectOfBinops - propagate the common flags to the merged binop As discussed on D106058 - we were failing to keep the common flags. This matches the behaviour in InstCombinerImpl::foldSelectOpOp.	2021-07-18 18:38:59 +01:00
Simon Pilgrim	5643be96bc	[DAG] Enable foldSelectOfBinops on select(setcc(),binop(),binop()) calls	2021-07-18 18:38:59 +01:00
Simon Pilgrim	1a6a8443c2	[DAG] Move select(cc, binop(), binop()) folds into DAGCombiner::foldSelectOfBinops. NFCI. I'm going to extend the functionality started in D106058 so move the folds into their own method to reduce the amount of code in DAGCombiner::visitSELECT	2021-07-18 14:54:41 +01:00
Amara Emerson	4c55cdb00a	[GlobalISel] Fix known bits for G_BSWAP and B_BITREVERSE not doing anything. llvm::KnownBits::byteSwap() and reverse() don't modify in-place, so we weren't actually computing anything. This was causing a miscompile on an arm64 stage2 bootstrap clang build.	2021-07-17 23:07:16 -07:00
Kazu Hirata	1993b73755	[Analaysis, CodeGen] Remove getHotSucc (NFC) These functions seem to be unused for at least 5 years.	2021-07-17 07:31:36 -07:00
Amara Emerson	9637848f51	[GlobalISel] Fix non-pow-2 legalization of s56 stores. s56 stores are broken down into s32 + s24 stores. During this step both of those new stores use an anyextended s64 value, resulting in truncating stores. With s56, the s24 requires another lower step to make it legal, and we were crashing because we didn't expect non-pow-2 stores to also be truncating as well. Differential Revision: https://reviews.llvm.org/D106183	2021-07-16 13:29:49 -07:00
Guozhi Wei	5609c8b607	[X86FixupLEAs] Try again to transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 Similar optimization can also be applied to LEA/ADD sequence. The modifications to TwoAddressInstructionPass is to ensure the operands of ADD instruction has expected order (the dest register of LEA should be src register of ADD). Differential Revision: https://reviews.llvm.org/D104684	2021-07-16 10:16:03 -07:00
Jon Roelofs	6c40abb6fe	Revert "[MachineVerifier] Diagnose invalid INSERT_SUBREGs" This reverts commit `dd57ba1a17`. It broke some tests: http://45.33.8.238/linux/51314/step_12.txt	2021-07-16 09:53:55 -07:00
Simon Pilgrim	95995673d1	[DAG] SelectionDAG::MaskedElementsAreZero - assert we're calling with a vector. NFCI. Add an assertion that we've calling MaskedElementsAreZero with a vector op and that the DemandedElts arg is a matching width. Makes the error a lot easier to grok when something else accidentally gets used.	2021-07-16 17:43:35 +01:00
Jon Roelofs	dd57ba1a17	[MachineVerifier] Diagnose invalid INSERT_SUBREGs Differential revision: https://reviews.llvm.org/D105953	2021-07-16 09:43:12 -07:00
Matt Arsenault	5a0d940f2a	GlobalISel: Preserve memory type for memset expansion	2021-07-16 11:41:32 -04:00
Matt Arsenault	f57f8f7ccc	GlobalISel: Remove dead function	2021-07-16 08:59:25 -04:00
Jeremy Morse	231bf52119	[InstrRef][FastISel] Support emitting DBG_INSTR_REF from fast-isel If you attach __attribute__((optnone)) to a function when using optimisations, that function will use fast-isel instead of the usual SelectionDAG method. This is a problem for instruction referencing, because it means DBG_VALUEs of virtual registers will be created, triggering some safety assertions in LiveDebugVariables. Those assertions exist to detect exactly this scenario, where an unexpected piece of code is generating virtual register references in instruction referencing mode. Fix this by transforming the DBG_VALUEs created by fast-isel into half-formed DBG_INSTR_REFs, after which they get patched up in finalizeDebugInstrRefs. The test modified adds a fast-isel mode to the instruction referencing isel test. Differential Revision: https://reviews.llvm.org/D105694	2021-07-16 13:56:15 +01:00
Matt Arsenault	a2d7ace3e3	GlobalISel: Surface offsets parameter from ComputeValueVTs	2021-07-15 19:11:40 -04:00
Matt Arsenault	e91da668d0	GlobalISel: Track argument pointeriness with arg flags Since we're still building on top of the MVT based infrastructure, we need to track the pointer type/address space on the side so we can end up with the correct pointer LLTs when interpreting CCValAssigns.	2021-07-15 19:11:40 -04:00
Amara Emerson	4e3dc6b8dd	GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI. This adds some level of type safety, allows helper functions to be added for specific opcodes for free, and also allows us to succinctly check for class membership with the usual dyn_cast/isa/cast functions. To start off with, add variants for the different load/store operations with some places using it. Differential Revision: https://reviews.llvm.org/D105751	2021-07-15 15:21:57 -07:00
Jessica Paquette	5da0f9ab61	[GlobalISel] Fix infinite loop in reassociationCanBreakAddressingModePattern It didn't update the opcode while walking through G_INTTOPTR/G_PTRTOINT. Differential Revision: https://reviews.llvm.org/D106080	2021-07-15 10:09:07 -07:00
Simon Pilgrim	0aece73aba	[DAG] Fold select(cond,binop(x,y),binop(x,z)) -> binop(x,select(cond,y,z)) Similar to the folds performed in InstCombinerImpl::foldSelectOpOp, this attempts to push a select further up to help merge a pair of binops. I'm primarily interested in select(cond,add(x,y),add(x,z)) folds to help expose pointer math (see https://bugs.llvm.org/show_bug.cgi?id=51069 etc.) but I've tried to use the more generic isBinOp(). Differential Revision: https://reviews.llvm.org/D106058	2021-07-15 16:08:30 +01:00
Tim Northover	5d7632ee72	MachO: don't emit L... private symbols in do_not_dead_strip sections. The linker can sometimes drop the do_not_dead_strip if it can't associate the atom with a symbol (the other place to specify no dead-stripping in MachO files).	2021-07-15 14:40:43 +01:00
Djordje Todorovic	fa2daaeff8	[2/2][RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This patch adds the forward scan for finding redundant DBG_VALUEs. This analysis aims to remove redundant DBG_VALUEs by going forward in the basic block by considering the first DBG_VALUE as a valid until its first (location) operand is not clobbered/modified. For example: (1) DBG_VALUE $edi, !"var1", ... (2) <block of code that does affect $edi> (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (3). Differential Revision: https://reviews.llvm.org/D105280	2021-07-15 00:08:31 -07:00
Kai Luo	b9c3941cd6	[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D103614	2021-07-15 01:12:09 +00:00
Stanislav Mekhanoshin	76b7d3432e	[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization Any def of EXEC prevents rematerialization of any VOP instruction because of the physreg use. Create a callback to check if the physreg use can be ingored to allow rematerialization. Differential Revision: https://reviews.llvm.org/D105836	2021-07-14 13:03:58 -07:00
Eli Friedman	1e30bf8621	[SelectionDAG] Add an overload of getStepVector that assumes step 1. This is mostly a minor convenience, but the pattern seems frequent enough to be worthwhile (and we'll probably add more uses in the future). Differential Revision: https://reviews.llvm.org/D105850	2021-07-14 11:37:01 -07:00
Matt Arsenault	47269da5d8	GlobalISel: Handle lowering non-power-of-2 extloads	2021-07-14 11:54:11 -04:00
Djordje Todorovic	df686842bc	[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This new MIR pass removes redundant DBG_VALUEs. After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) variables assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having redundant DBG_VALUE. This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with the respect to that. This first patch performs bacward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one: For example: (1) DBG_VALUE $edi, !"var1", ... (2) DBG_VALUE $esi, !"var2", ... (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (1). By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed. Differential Revision: https://reviews.llvm.org/D105279	2021-07-14 04:29:42 -07:00
Ruiling Song	40e3df2a1b	[RegisterCoalescer] Resolve conflict based on liveness of subregister Currently we are resolving lane/subregister conflict by visiting instructions sequentially in current block to see whether there is any use of the tainted lanes. To save compile time, we are not doing further check in successor blocks. This sounds reasonable without subgregister liveness. But since we have added subregister liveness tracking capability to register coalescer, we can easily determine whether we have subregister liveness conflict by checking subranges. This would help coalescing more COPYs for target that enables subregister liveness tracking. Reviewed by: arsenm, qcolombet Differential Revision: https://reviews.llvm.org/D104509	2021-07-14 14:43:22 +08:00
Hongtao Yu	74b99b5c2e	[CSSPGO] Do not import pseudo probe desc in thinLTO Previously we reliedy on pseudo probe descriptors to look up precomputed GUID during probe emission for inlined probes. Since we are moving to always using unique linkage names, GUID for functions can be computed in place from dwarf names. This eliminates the need of importing pseudo probe descs in thinlto, since those descs should be emitted by the original modules. This significantly reduces thinlto memory footprint in some extreme case where the number of imported modules for a single module is massive. Test Plan: Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D105248	2021-07-13 18:26:36 -07:00
Matt Arsenault	eebe841a47	RegAlloc: Allow targets to split register allocation AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will be needed to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage of this is currently StackSlotColoring is no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQB is not currently supported, so this also prevents using the unhandled allocator.	2021-07-13 18:49:29 -04:00
Guillaume Chatelet	2c47b8847e	Revert "[llvm] Add enum iteration to Sequence" This reverts commit `a006af5d6e`.	2021-07-13 16:44:42 +00:00
Guillaume Chatelet	a006af5d6e	[llvm] Add enum iteration to Sequence This patch allows iterating typed enum via the ADT/Sequence utility. Differential Revision: https://reviews.llvm.org/D103900	2021-07-13 16:22:19 +00:00
Matt Arsenault	222fde1eec	GlobalISel: Use extension instead of merge with undef in common case This fixes not respecting signext/zeroext in these cases. In the anyext case, this avoids a larger merge with undef and should be a better canonical form. This should also handle this if a merge is needed, but I'm not aware of a case where that can happen. In a future change this will also allow AMDGPU to drop some custom code without introducing regressions.	2021-07-13 11:04:47 -04:00
Matt Arsenault	77a608d9de	GlobalISel: Remove getIntrinsicID utility function This is redundant with a method directly on MachineInstr	2021-07-13 11:04:10 -04:00
Qiu Chaofan	954a15d639	[SelectionDAG] Check use before combining into USUBSAT Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D105789	2021-07-13 14:50:26 +08:00
Jessica Paquette	47d0780f45	[GlobalISel] Handle more types in narrowScalar for eq/ne G_ICMP Generalize the existing eq/ne case using `extractParts`. The original code only handled narrowings for types of width 2n->n. This generalization allows for any type that can be broken down by `extractParts`. General overview is: - Loop over each narrow-sized part and do exactly what the 2-register case did. - Loop over the leftover-sized parts and do the same thing - Widen the leftover-sized XOR results to the desired narrow size - OR that all together and then do the comparison against 0 (just like the old code) This shows up a lot when building clang for AArch64 using GlobalISel, so it's worth fixing. For the sake of simplicity, this doesn't handle the non-eq/ne case yet. Also remove the code in this case that notifies the observer; we're just going to delete MI anyway so talking to the observer shouldn't be necessary. Differential Revision: https://reviews.llvm.org/D105161	2021-07-12 22:18:50 -07:00
Arthur Eubanks	7987c46273	[OpaquePtr][ISel] Use ArgListEntry::IndirectType more	2021-07-12 21:14:35 -07:00
Eli Friedman	ec1cdee6aa	[SelectionDAG][RISCV] Support @llvm.vscale.i64() on 32-bit targets. Not really useful on its own, but D105673 depends on it. Differential Revision: https://reviews.llvm.org/D105840	2021-07-12 14:53:42 -07:00
Jinsong Ji	28fb69e00a	[AIX] Emit version string in .file directive AIX .file directive support including compiler version string. https://www.ibm.com/docs/en/aix/7.2?topic=ops-file-pseudo-op This patch adds the support so that it will be easier to identify build compiler in objects. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D105743	2021-07-12 17:03:52 +00:00
Bradley Smith	112c09039b	[SelectionDAG] Simplify PromoteIntRes_INSERT_SUBVECTOR to only handle result Let other parts of legalization handle the rest of the node, this allows re-use of existing optimizations elsewhere. Differential Revision: https://reviews.llvm.org/D105624	2021-07-12 15:20:44 +00:00
David Truby	c305557acd	[llvm][sve] Lowering for VLS truncating stores This adds custom lowering for truncating stores when operating on fixed length vectors in SVE. It also includes a DAG combine to fold extends followed by truncating stores into non-truncating stores in order to prevent this pattern appearing once truncating stores are supported. Currently truncating stores are not used in certain cases where the size of the vector is larger than the target vector width. Differential Revision: https://reviews.llvm.org/D104471	2021-07-12 11:14:17 +01:00
David Green	dc0bbc9d89	[IfCvt] Don't use pristine register for counting liveins for predicated instructions. The test case here hits machine verifier problems. There are volatile long loads that the results of do not get used, loading into two dead registers. IfCvt will predicate them and as it does will add implicit uses of the predicating registers due to thinking they are live in. As nothing has used the register, the machine verifier disagrees that they are really live and we end up with a failure. The registers come from Pristine regs that LivePhysRegs counts as live. This patch adds a addLiveInsNoPristines method to be used instead in IfCvt, so that only really live in regs need to be added as implicit operands. Differential Revision: https://reviews.llvm.org/D90965	2021-07-11 14:45:54 +01:00
Amara Emerson	97c426394a	[AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR. Differential Revision: https://reviews.llvm.org/D103301	2021-07-10 00:25:26 -07:00
Amara Emerson	58a2cb5143	[GlobalISel] Add a new artifact combiner for unmerge which looks through general artifact expressions. The original motivation for this was to implement moreElementsVector of shuffles on AArch64, which resulted in complex sequences of artifacts like unmerge(unmerge(concat...)) which the combiner couldn't handle. It seemed here that the better option, instead of writing ever-more-complex combines, was to have a way to find the original "non-artifact" source registers for a given definition, walking through arbitrary expressions of unmerge/concat/insert. As long as the bits aren't extended or truncated, this is a pretty simple algorithm that avoids the need for lots of combines and instead jumps straight to the final result we want. I've only used this new technique in 2 places within tryCombineUnmerge, using it in more general situations resulted in infinite loops in AMDGPU. So for now it's used when we would otherwise fail to combine and that seems to work. In order to support looking through G_INSERTs, I also had to add it as an artifact in isArtifact(), which caused a whole lot of issues in tests. AMDGPU started infinite looping since full legalization of G_INSERT doensn't seem to be there. To work around this, I've temporarily added a CLI option to use the old behaviour so that the MIR tests will still run and terminate. Other minor changes include no longer making >128b G_MERGE/UNMERGE legal. We never had isel support for that anyway and it was a remnant of the legacy legalizer rules. However being legal prevented the combiner from checking if it was dead and deleting them. Differential Revision: https://reviews.llvm.org/D104355	2021-07-09 22:35:00 -07:00
Jessica Paquette	47aeeffc8f	[GlobalISel] Use GCDTy when extracting GCD ty from leftover regs in insertParts `LegalizerHelper::insertParts` uses `extractGCDType` on registers split into a desired type and a smaller leftover type. This is used to populate a list of registers. Each register in the list will have the same type as returned by `extractGCDType`. If we have - `ResultTy` = s792 - `PartTy` = s64 - `LeftoverTy` = s24 When we call `extractGCDType`, we'll end up with two different types appended to the list: Part: gcd(792, 64, 24) => s8 Leftover: gcd(792, 24, 24) => s24 When this happens, we'll hit an assert while trying to build a G_MERGE_VALUES. This patch changes the code for the leftover type so that we reuse the GCD from the desired type. e.g. Leftover: gcd(792, 8, 24) => s8 https://llvm.godbolt.org/z/137Kqxj6j Differential Revision: https://reviews.llvm.org/D105674	2021-07-09 14:15:44 -07:00
Wouter van Oortmerssen	9647a6f719	[WebAssembly] Added initial type checker to MC Assembler This to protect against non-sensical instruction sequences being assembled, which would either cause asserts/crashes further down, or a Wasm module being output that doesn't validate. Unlike a validator, this type checker is able to give type-errors as part of the parsing process, which makes the assembler much friendlier to be used by humans writing manual input. Because the MC system is single pass (instructions aren't even stored in MC format, they are directly output) the type checker has to be single pass as well, which means that from now on .globaltype and .functype decls must come before their use. An extra pass is added to Codegen to collect information for this purpose, since AsmPrinter is normally single pass / streaming as well, and would otherwise generate this information on the fly. A `-no-type-check` flag was added to llvm-mc (and any other tools that take asm input) that surpresses type errors, as a quick escape hatch for tests that were not intended to be type correct. This is a first version of the type checker that ignores control flow, i.e. it checks that types are correct along the linear path, but not the branch path. This will still catch most errors. Branch checking could be added in the future. Differential Revision: https://reviews.llvm.org/D104945	2021-07-09 14:07:25 -07:00
Jeremy Morse	30cce54dad	[X86] Return src/dest register from stack spill/restore recogniser LLVM provides target hooks to recognise stack spill and restore instructions, such as isLoadFromStackSlot, and it also provides post frame elimination versions such as isLoadFromStackSlotPostFE. These are supposed to return the store-source and load-destination registers; unfortunately on X86, the PostFE recognisers just return "1", apparently to signify "yes it's a spill/load". This patch alters the hooks to correctly return the store-source and load-destination registers: This is really useful for debug-info as we it helps follow variable values as they move on/off the stack. There should be no codegen changes: the only other users of these PostFE target hooks are MachineInstr::getRestoreSize and MachineInstr::getSpillSize, which don't attempt to interpret the returned register location. While we're here, delete the (InstrRef) LiveDebugValues heuristic that tries to find the spill source register by looking for a killed reg -- we should be able to rely on the target hooks for that. This involves temporarily turning off a n InstrRef LivedDebugValues test on aarch64 (patch to re-enable it is in D104521). Differential Revision: https://reviews.llvm.org/D105428	2021-07-09 18:12:30 +01:00
Jeremy Morse	f551fb96c7	[Debug-info][InstrRef] Avoid an unnecessary map ordering We keep a record of substitutions between debug value numbers post-isel, however we never actually look them up until the end of compilation. As a result, there's nothing gained by the collection being a std::map. This patch downgrades it to being a vector, that's then sorted at the end of compilation in LiveDebugValues. Differential Revision: https://reviews.llvm.org/D105029	2021-07-09 15:43:13 +01:00
Muhammad Omair Javaid	932e3d9960	Revert "GlobalISel/AArch64: don't optimize away redundant branches at -O0" This reverts commit `458c230b5e`. This broke LLDB buildbot testcase where breakpoint set at start of loop failed to hit. https://lab.llvm.org/buildbot/#/builders/96/builds/9404 https://github.com/llvm/llvm-project/blob/main/lldb/test/API/commands/process/attach/main.cpp#L15 Differential Revision: https://reviews.llvm.org/D105238	2021-07-09 08:23:36 +05:00
David Blaikie	1def2579e1	PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23 C++23 will make these conversions ambiguous - so fix them to make the codebase forward-compatible with C++23 (& a follow-up change I've made will make this ambiguous/invalid even in <C++23 so we don't regress this & it generally improves the code anyway)	2021-07-08 13:37:57 -07:00
Matt Arsenault	9b057f647d	GlobalISel: Track original argument index in ArgInfo SelectionDAG's equivalents in ISD::InputArg/OutputArg track the original argument index. Mips relies on this, and its currently reinventing its own parallel CallLowering infrastructure which tracks these indexes on the side. Add this to help move towards deleting the custom mips handling.	2021-07-08 13:39:02 -04:00
Jeremy Morse	63cc251eb9	[DebugInfo][InstrRef][4/4] Support DBG_INSTR_REF through all backend passes This is a cleanup patch -- we're now able to support all flavours of variable location in instruction referencing mode. This patch updates various tests for debug instructions to be broader: numerous code paths try to ignore debug isntructions, and they now have to ignore the additional DBG_PHI and DBG_INSTR_REFs that we can generate. A small amount of rework happens for LiveDebugVariables: as we don't need to track live intervals through regalloc any more, we can get away with unlinking debug instructions before regalloc, then re-inserting them after. Note that this isn't (yet) true of DBG_VALUE_LISTs, they still have to go through live interval tracking. In SelectionDAG, add a helper lambda that emits half-formed DBG_INSTR_REFs for arguments in instr-ref mode, DBG_VALUE otherwise. This is one of the final locations where DBG_VALUEs are emitted for vreg arguments. X86InstrInfo now un-sets the debug instr number on SUB instructions that get mutated into CMP instructions. As the instruction no longer computes a subtraction, we can't use it for variable locations. Differential Revision: https://reviews.llvm.org/D88898	2021-07-08 16:42:24 +01:00
Stanislav Mekhanoshin	0fdb25cd95	[AMDGPU] Disable garbage collection passes Differential Revision: https://reviews.llvm.org/D105593	2021-07-07 15:47:57 -07:00
Arthur Eubanks	aad41e2299	[OpaquePtr] Use ArgListEntry::IndirectType for lowering ABI attributes Consolidate PreallocatedType and ByValType into IndirectType, and use that for inalloca.	2021-07-07 14:58:38 -07:00
Adrian Prantl	458c230b5e	GlobalISel/AArch64: don't optimize away redundant branches at -O0 This patch prevents GlobalISel from optimizing out redundant branch instructions when compiling without optimizations. The motivating example is code like the following common pattern in Swift, where users expect to be able to set a breakpoint on the early exit: public func f(b: Bool) { guard b else { return // I would like to set a breakpoint here. } ... } The patch modifies two places in GlobalISEL: The first one is in IRTranslator.cpp where the removal of redundant branches is made conditional on the optimization level. The second one is in AArch64InstructionSelector.cpp where an -O0 only optimization is being removed. Disabling these optimizations increases code size at -O0 by ~8%. However, doing so improves debuggability, and debug builds are the primary reason why developers compile without optimizations. We thus concluded that this is the right trade-off. rdar://79515454 Differential Revision: https://reviews.llvm.org/D105238	2021-07-07 12:51:55 -07:00
Dylan Fleming	8ae9ab43dd	[SVE] Fixed cast<FixedVectorType> on scalable vector in SelectionDAGBuilder::getUniformBase Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D105350	2021-07-07 10:48:17 +01:00
David Green	4ce26deac2	[DAG] Reassociate Add with Or We already have reassociation code for Adds and Ors separately in DAG combiner, this adds it for the combination of the two where Ors act like Adds. It reassociates (add (or (x, c), y) -> (add (add (x, y), c)) where we know that the Ors operands have no common bits set, and the Or has one use. Differential Revision: https://reviews.llvm.org/D104765	2021-07-07 10:21:07 +01:00
Jeremy Morse	2b2ffb7bdc	[DebugInfo][InstrRef][3/4] Produce DBG_INSTR_REFs for all variable locations This patch emits DBG_INSTR_REFs for two remaining flavours of variable locations that weren't supported: copies, and inter-block VRegs. There are still some locations that must be represented by DBG_VALUE such as constants, but they're mostly independent of optimisations. For variable locations that refer to values defined in different blocks, vregs are allocated before isel begins, but the defining instruction might not exist until late in isel. To get around this, emit DBG_INSTR_REFs in a "half done" state, where the first operand refers to a VReg. Then at the end of isel, patch these back up to refer to instructions, using the finalizeDebugInstrRefs method. Copies are something that I complained about the original RFC, and I really don't want to have to put instruction numbers on copies. They don't define a value: they move them. To address this isel, salvageCopySSA interprets: * COPYs, * SUBREG_TO_REG, * Anything that isCopyInstr thinks is a copy. And follows chains of copies back to the defining instruction that they read from. This relies on any physical registers that COPYs read being defined in the same block, or being entry-block arguments. For the former we can put an instruction number on the defining instruction; for the latter we can drop a DBG_PHI that reads the incoming value. Differential Revision: https://reviews.llvm.org/D88896	2021-07-06 18:31:38 +01:00
David Stuttard	83cb9632a1	[DAGCombiner] Add support for mulhi const folding in DAGCombiner Differential Revision: https://reviews.llvm.org/D103323 Change-Id: I4ffaaa32301795ba8a339567a68e77fe0862b869	2021-07-05 12:01:26 +01:00
Stephen Tozer	14b62f7e2f	[DebugInfo] CGP+HWasan: Handle dbg.values with duplicate location ops This patch fixes an issue which occurred in CodeGenPrepare and HWAddressSanitizer, which both at some point create a map of Old->New instructions and update dbg.value uses of these. They did this by iterating over the dbg.value's location operands, and if an instance of the old instruction was found, replaceVariableLocationOp would be called on that dbg.value. This would cause an error if the same operand appeared multiple times as a location operand, as the first call to replaceVariableLocationOp would update all uses of the old instruction, invalidating the old iterator and eventually hitting an assertion. This has been fixed by no longer iterating over the dbg.value's location operands directly, but by first collecting them into a set and then iterating over that, ensuring that we never attempt to replace a duplicated operand multiple times. Differential Revision: https://reviews.llvm.org/D105129	2021-07-05 10:35:19 +01:00
Paul Walker	287d39dd5a	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
Simon Pilgrim	80dd591610	[SelectionDAG] Replace APInt.lshr().trunc() with APInt.extractBits() where possible. NFC. This also allows us to use KnownBits::extractBits in one case.	2021-07-03 16:33:00 +01:00
Simon Pilgrim	e2e44c3da9	[SelectionDAG] Use KnownBits::insertBits instead of separate APInt::insertBits calls. NFC.	2021-07-03 16:32:59 +01:00
Craig Topper	af331e8284	[SelectionDAG] Rename memory VT argument for getMaskedGather/getMaskedScatter from VT to MemVT. Use getMemoryVT() in MGATHER/MSCATTER DAG combines instead of using the passthru or store value VT for this argument.	2021-07-02 17:37:40 -07:00
Jonas Devlieghere	52b5491a21	Revert "[DebugInfo] Enforce implicit constraints on `distinct` MDNodes" This reverts commit `8cd35ad854`. It breaks `TestMembersAndLocalsWithSameName.py` on GreenDragon and Mikael Holmén points out in D104827 that bitcode files created with the patch cannot be parsed with binaries built before it.	2021-07-02 15:57:07 -07:00
Amara Emerson	f30251f527	[GlobalISel] Clean up CombinerHelper::apply* functions to return void. For some reason we/I started writing these as returning bool when the return value is actually ignored by the combiner.	2021-07-02 13:17:06 -07:00
Amara Emerson	0111da2ef8	[GlobalISel] Add re-association combine for G_PTR_ADD to allow better addressing mode usage. We're trying to match a few pointer computation patterns here for re-association opportunities. 1) Isolating a constant operand to be on the RHS, e.g.: G_PTR_ADD(BASE, G_ADD(X, C)) -> G_PTR_ADD(G_PTR_ADD(BASE, X), C) 2) Folding two constants in each sub-tree as long as such folding doesn't break a legal addressing mode. G_PTR_ADD(G_PTR_ADD(BASE, C1), C2) -> G_PTR_ADD(BASE, C1+C2) AArch64 code size improvements on CTMark with -Os: Program before after diff pairlocalalign 251048 251044 -0.0% consumer-typeset 421820 421812 -0.0% kc 431348 431320 -0.0% SPASS 413404 413300 -0.0% clamscan 384396 384220 -0.0% tramp3d-v4 370640 370412 -0.1% lencod 432096 431772 -0.1% bullet 479400 478796 -0.1% sqlite3 288504 288072 -0.1% 7zip-benchmark 573796 570768 -0.5% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D105069	2021-07-02 12:31:21 -07:00
Krzysztof Parzyszek	df88c26f0d	[OpaquePtr] Add type parameter to emitLoadLinked Differential Revision: https://reviews.llvm.org/D105353	2021-07-02 13:07:40 -05:00
Jinsong Ji	03e9dcfd41	[AIX] Use AsmParser to do inline asm parsing Add a flag so that target can choose to use AsmParser for parsing inline asm. And set the flag by default for AIX. -no-intergrated-as will override this default if specified explicitly. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D105314	2021-07-02 16:12:21 +00:00
Alexandru Octavian Butiu	e90c6f5596	[MachineCopyPropagation] Fix differences in code gen when compiling with -g Fixes bugs [[ https://bugs.llvm.org/show_bug.cgi?id=50580 \| 50580 ]] and [[ https://bugs.llvm.org/show_bug.cgi?id=49446 \| 49446 ]] When compiling with -g "DBG_VALUE <reg>" instructions are added in the MIR, if such a instruction is inserted between instructions that use <reg> then MachineCopyPropagation invalidates <reg> , this causes some copies to not be propagated and causes differences in code generation (ex bugs 50580 and 49446 ). DBG_VALUE instructions should be ignored since they don't actually modify the register. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D104394	2021-07-02 19:27:06 +08:00
Roman Lebedev	c2c0d3ea89	Revert "[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR" This reverts commit `4facbf213c`. ``` ****************** FAIL: LLVM :: CodeGen/WebAssembly/funcref-call.ll (44466 of 44468) **************** TEST 'LLVM :: CodeGen/WebAssembly/funcref-call.ll' FAILED ****************** Script: -- : 'RUN: at line 1'; /builddirs/llvm-project/build-Clang12/bin/llc < /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll --mtriple=wasm32-unknown-unknown -asm-verbose=false -mattr=+reference-types \| /builddirs/llvm-project/build-Clang12/bin/FileCheck /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll -- Exit Code: 2 Command Output (stderr): -- llc: /repositories/llvm-project/llvm/include/llvm/Support/LowLevelTypeImpl.h:44: static llvm::LLT llvm::LLT::scalar(unsigned int): Assertion `SizeInBits > 0 && "invalid scalar size"' failed. ```	2021-07-02 11:49:51 +03:00
Paulo Matos	4facbf213c	[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR Reland of `31859f896`. This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and lowering methods for load and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Differential Revision: https://reviews.llvm.org/D104797	2021-07-02 09:46:28 +02:00
Craig Topper	066524ea54	[ScalarizeMaskedMemIntrin][SelectionDAGBuilder] Use the element type to calculate alignment for gather/scatter when alignment operand is 0. Previously we used the vector type, but we're loading/storing invididual elements so I think only element alignment should matter. Noticed while looking at the code for something else so I don't have a test case. Differential Revision: https://reviews.llvm.org/D105220	2021-07-01 19:08:47 -07:00
Jessica Paquette	e59f02216f	[GlobalISel] Translate <1 x N> getelementptrs to scalar G_PTR_ADDs In `IRTranslator::translateGetElementPtr`, when we run into a vector gep with some scalar operands, we try to normalize those operands using `buildSplatVector`. This is fine except for when the getelementptr has a <1 x N> type. In that case it is treated as a scalar. If we run into one of these then every call to ``` // With VectorWidth = 1 LLT::fixed_vector(VectorWidth, PtrTy) ``` will assert. Here's an example (equivalent to the added testcase): https://godbolt.org/z/hGsTnMYdW To get around this, this patch adds a variable, `WantSplatVector`, which is true when our vector type ought to actually be represented using a vector. When it's false, we'll translate as a scalar. This checks if `VectorWidth > 1`. This fixes this bug: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35496 Differential Revision: https://reviews.llvm.org/D105316	2021-07-01 16:38:47 -07:00
Jon Roelofs	14d64be6e5	[GISel] Print better error messages for missing Combiner Observer calls Differential revision: https://reviews.llvm.org/D105290	2021-07-01 15:18:18 -07:00
Matt Arsenault	99c7e918b5	GlobalISel: Use LLT in call lowering callbacks This preserves the memory type so the lowerings can rely on them.	2021-07-01 12:15:54 -04:00
Bradley Smith	2668727929	[SelectionDAG] Implement PromoteIntRes_INSERT_SUBVECTOR Inserting into a smaller-than-legal scalable vector would result in an internal compiler error. For example, inserting a <vscale x 4 x i8> into a <vscale x 8 x i8> (both illegal vector types for SVE) would cause a crash. This crash was happening because there was no code to promote (legalise) the result of an INSERT_SUBVECTOR node. This patch implements PromoteIntRes_INSERT_SUBVECTOR, which legalises the ISD node. This is currently done by going through memory. This is necessary because of the requirement that the SubVec parameter of the INSERT_SUBVECTOR node must be smaller than the Vec parameter, which means that INSERT_SUBVECTOR cannot always have a legal result/operand types. Co-Authored-by: Joe Ellis <joe.ellis@arm.com> Differential Revision: https://reviews.llvm.org/D102766	2021-07-01 17:05:53 +01:00
Jeremy Morse	e9641c911e	[DebugInfo][InstrRef][2/4] Use subreg substitutions in LiveDebugValues Added in `47c3fe2a22`, we sometimes need to describe a variable value substitution with a subregister qualifier, to say that "the value is the lower 32 bits of this 64 bit register def" for example. That then needs support during LiveDebugValues to interpret the subregister qualifiers, which is what this patch adds. Whenever we encounter a DBG_INSTR_REF and find its value by using a substitution, collect any subregister qualifiers seen. Then, accumulate the effects of the qualifiers to work out what offset and what size should be extracted from the defined register. Finally, for the target ValueIDNum, extract whatever subregister is in the correct position Currently, describing a subregister field of a larger value that has been spilt to the stack, is unimplemented. Differential Revision: https://reviews.llvm.org/D88894	2021-07-01 13:07:16 +01:00
Bradley Smith	01b846674d	[AArch64][SVE] Add support for fixed length MSCATTER/MGATHER Since gather lowering can now lower to nodes that may need expansion via the vector legalizer, do MGATHER lowering via vector legalizer. Additionally, as part of adding passthru support for fixed typed gathers, fix passthru support for scalable types. Depends on D104910 Differential Revision: https://reviews.llvm.org/D104217	2021-07-01 12:13:59 +01:00
Jeremy Morse	47c3fe2a22	[DebugInfo][InstrRef][1/4] Support transformations that widen values Very late in compilation, backends like X86 will perform optimisations like this: $cx = MOV16rm $rax, ... -> $rcx = MOV64rm $rax, ... Widening the load from 16 bits to 64 bits. SEeing how the lower 16 bits remain the same, this doesn't affect execution. However, any debug instruction reference to the defined operand now refers to a 64 bit value, nto a 16 bit one, which might be unexpected. Elsewhere in codegen, there's often this pattern: CALL64pcrel32 @foo, implicit-def $rax %0:gr64 = COPY $rax %1:gr32 = COPY %0.sub_32bit Where we want to refer to the definition of $eax by the call, but don't want to refer the copies (they don't define values in the way LiveDebugValues sees it). To solve this, add a subregister field to the existing "substitutions" facility, so that we can describe a field within a larger value definition. I would imagine that this would be used most often when a value is widened, and we need to refer to the original, narrower definition. Differential Revision: https://reviews.llvm.org/D88891	2021-07-01 11:19:27 +01:00
Qiu Chaofan	07f0faed11	[NFC][Scheduler] Refactor tryCandidate to return boolean This patch changes return type of tryCandidate from void to bool: 1. Methods in some targets already follow this convention. 2. This would help if some target wants to re-use generic code. 3. It looks more intuitive if these try-method returns the same type. We may need to change return type of them from bool to some enum further, to make it less confusing. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D103951	2021-07-01 14:31:47 +08:00
Jeremy Morse	1575583f2a	[LiveDebugValues][InstrRef][2/2] Emit entry value variable locations This patch adds support to the instruction-referencing LiveDebugValues implementation for emitting entry values. The instruction referencing implementations tracking by value rather than location means that we can get around two of the issues with VarLocs. DBG_VALUE instructions that re-assign the same value to a variable are no longer a problem, because we can "see through" to the value being assigned. We also don't need to do anything special during the dataflow stages: the "variable value problem" doesn't need to know whether a value is available most of the time, and the times it deoes need to know are always when entry values need to be terminated. The patch modifies the "TransferTracker" class, adding methods to identify when a variable ias an entry value candidate, and when a machine value is an entry value. recoverAsEntryValue tests these two things and emits an entry-value expression if they're true. It's used when we clobber or otherwise lose a value and can't find a replacement location for the value it contained. Differential Revision: https://reviews.llvm.org/D88406	2021-06-30 23:07:39 +01:00
Matt Arsenault	28f2f66200	GlobalISel: Use LLT in memory legality queries This enables proper lowering of non-byte sized loads. We still aren't faithfully preserving memory types everywhere, so the legality checks still only consider the size.	2021-06-30 17:44:13 -04:00
Matt Arsenault	a601b308d9	GlobalISel: Lower non-byte loads and stores Previously we didn't preserve the memory type and had to blindly interpret a number of bytes. Now that non-byte memory accesses are representable, we can handle these correctly. Ported from DAG version (minus some weird special case i1 legality checking which I don't fully understand, and we don't have a way to query for) For now, this is NFC and the test changes are placeholders. Since the legality queries are still relying on byte-flattened memory sizes, the legalizer can't actually see these non-byte accesses. This keeps this change self contained without merging it with the larger patch to switch to LLT memory queries.	2021-06-30 17:05:50 -04:00
Matt Arsenault	748e0b07dc	GlobalISel: Preserve memory type when reducing load/store width	2021-06-30 17:05:29 -04:00
Matt Arsenault	fae05692a3	CodeGen: Print/parse LLTs in MachineMemOperands This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted all of the tests already, but likely missed a few). Not sure what the exact syntax and policy should be. We can continue printing the number of bytes for non-generic instructions to avoid test churn and only allow non-scalar types for generic instructions. This will currently print the LLT in parentheses, but accept parsing the existing integers and implicitly converting to scalar. The parentheses are a bit ugly, but the parser logic seems unable to deal without either parentheses or some keyword to indicate the start of a type.	2021-06-30 16:54:13 -04:00
Jon Roelofs	a642872476	[GISel] Support llvm.memcpy.inline Differential revision: https://reviews.llvm.org/D105072	2021-06-30 12:39:05 -07:00
Jeremy Morse	4955544162	[LiveDebugValues][InstrRef][1/2] Recover more clobbered variable locations In various circumstances, when we clobber a register there may be alternative locations that the value is live in. The classic example would be a value loaded from the stack, and then clobbered: the value is still available on the stack. InstrRefBasedLDV was coping with this at block starts where it's forced to pick a location, however it wasn't searching for alternative locations when values were clobbered. This patch notifies the "Transfer Tracker" object when clobbers occur, and it's able to find alternatives and issue DBG_VALUEs for that location. See: the added test. Differential Revision: https://reviews.llvm.org/D88405	2021-06-30 16:56:25 +01:00
Bradley Smith	002911503f	[TargetLowering][AArch64][SVE] Take into account accessed type when clamping address When clamping the index for a memory access to a stacked vector we must take into account the entire type being accessed, not just assume that we are accessing only a single element. Differential Revision: https://reviews.llvm.org/D105016	2021-06-30 13:30:18 +01:00
David Blaikie	632e15e766	Conditionalize function only used in an assert to address -Wunused-function	2021-06-29 16:39:59 -07:00
Matt Arsenault	990278d026	CodeGen: Store LLT instead of uint64_t in MachineMemOperand GlobalISel is relying on regular MachineMemOperands to track all of the memory properties of accesses. Just the raw byte size is insufficent to disambiguate all situations. For example, if we need to split an unaligned extending load, we need to know the number of bits in the original source value and can't infer it from the result type. This is also a problem for extending vector loads. This does decrease the maximum representable size from the full uint64_t bytes to a maximum of 16-bits. No in tree testcases hit this, other than places using UINT64_MAX for unknown sizes. This may be an issue for G_MEMCPY and co., although they can just use unknown size for large static sizes. This also has potential for backend abuse by relying on the type when it really shouldn't be relevant after selection. This does not include the necessary MIR printer/parser changes to represent this.	2021-06-29 17:38:51 -04:00
Matt Arsenault	49fa6abf74	Revert "GlobalISel: Use MMO helper for getting the size in bits" This reverts commit `dc98adfb44`. This should still be done, but this is currently causing some commit ordering issues.	2021-06-29 17:38:51 -04:00
Craig Topper	9132299836	[LegalizeTypes][VE] Don't Expand BITREVERSE/BSWAP during type legalization promotion if they will be promoted for NVT in op legalization. We were trying to expand these if they were going to be expanded in op legalization so that we generated the minimum number of operations. We failed to take into account that NVT could be promoted to another legal type in op legalization. Hoping this fixes the issue on the VE target reported as a follow up to D96681. The check line changes were taken from before `1e46b6f401` so this patch does appear to improve some cases that had previously regressed.	2021-06-29 11:00:11 -07:00
Jeremy Morse	e63b18bc84	Catch an extremely obvious memory leak, thanks asan https://lab.llvm.org/buildbot/#/builders/5/builds/9208 (dbg-phis-merging-in-ldv.mir and dbg-phis-with-loops.mir in the asan check stage)	2021-06-29 15:47:17 +01:00
Jeremy Morse	010108bb2c	[DebugInstrRef][3/3] Follow DBG_PHI instructions through LiveDebugValues This patch reads machine value numbers from DBG_PHI instructions (marking where SSA PHIs used to be), and matches them up with DBG_INSTR_REF instructions that refer to them. Essentially they are two separate parts of a DBG_VALUE: the place to read the value (register and program position), and where the variable is assigned that value. Sometimes these DBG_PHIs can be duplicated, usually by tail duplication. This corresponds to the SSA structure of the program being destroyed, and the original PHI being split. When this happens: run LLVMs standard SSAUpdater utility, to work out what values should appear in which blocks. The majority of this patch is boilerplate to make use of SSAUpdater. If there are any additional PHIs on the path between multiple DBG_PHIs and their using DBG_INSTR_REF, their existance is validated, just in case a value gets clobbered along the way (see dbg-phis-with-loops.mir for several examples). Differential Revision: https://reviews.llvm.org/D86814	2021-06-29 14:45:13 +01:00
Michael Liao	e818eface8	[MIRParser] Add machine metadata. - Add standalone metadata parsing support so that machine metadata nodes could be populated before and accessed during MIR is parsed. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103282	2021-06-28 22:29:36 -04:00
Scott Linder	8cd35ad854	[DebugInfo] Enforce implicit constraints on `distinct` MDNodes Add UNIQUED and DISTINCT properties in Metadata.def and use them to implement restrictions on the `distinct` property of MDNodes: * DIExpression can currently be parsed from IR or read from bitcode as `distinct`, but this property is silently dropped when printing to IR. This causes accepted IR to fail to round-trip. As DIExpression appears inline at each use in the canonical form of IR, it cannot actually be `distinct` anyway, as there is no syntax to describe it. * Similarly, DIArgList is conceptually always uniqued. It is currently restricted to only appearing in contexts where there is no syntax for `distinct`, but for consistency it is treated equivalently to DIExpression in this patch. * DICompileUnit is already restricted to always being `distinct`, but along with adding general support for the inverse restriction I went ahead and described this in Metadata.def and updated the parser to be general. Future nodes which have this restriction can share this support. The new UNIQUED property applies to DIExpression and DIArgList, and forbids them to be `distinct`. It also implies they are canonically printed inline at each use, rather than via MDNode ID. The new DISTINCT property applies to DICompileUnit, and requires it to be `distinct`. A potential alternative change is to forbid the non-inline syntax for DIExpression entirely, as is done with DIArgList implicitly by requiring it appear in the context of a function. For example, we would forbid: !named = !{!0} !0 = !DIExpression() Instead we would only accept the equivalent inlined version: !named = !{!DIExpression()} This essentially removes the ability to create a `distinct` DIExpression by construction, as there is no syntax for `distinct` inline. If this patch is accepted as-is, the result would be that the non-canonical version is accepted, but the following would be an error and produce a diagnostic: !named = !{!0} ; error: 'distinct' not allowed for !DIExpression() !0 = distinct !DIExpression() Also update some documentation to consistently use the inline syntax for DIExpression, and to describe the restrictions on `distinct` for nodes where applicable. Reviewed By: StephenTozer, t-tye Differential Revision: https://reviews.llvm.org/D104827	2021-06-28 21:20:04 +00:00
Melanie Blower	931e95687d	[llvm][clang][fpenv] Create new intrinsic llvm.arith.fence to control FP optimization at expression level This intrinsic blocks floating point transformations by the optimizer. Author: Pengfei Reviewed By: LuoYuanke, Andy Kaylor, Craig Topper, kpn Differential Revision: https://reviews.llvm.org/D99675	2021-06-28 12:26:52 -04:00
Sander de Smalen	0e09d18c6a	Reland [GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize. This patch relands https://reviews.llvm.org/D104454, but fixes some failing builds on Mac OS which apparently has a different definition for size_t, that caused 'ambiguous operator overload' for the implicit conversion of TypeSize to a scalar value. This reverts commit `b732e6c9a8`.	2021-06-28 15:24:27 +01:00
Ahsan Saghir	31ef15e044	Teach peephole optimizer to not emit sub-register defs Peephole optimizer should not be introducing sub-reg definitions as they are illegal in machine SSA phase. This patch modifies the optimizer to not emit sub-register definitions. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103408	2021-06-28 09:24:07 -05:00
Brendon Cahoon	f9f5d41545	[AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX Adds legalizer, register bank select, and instruction select support for G_SBFX and G_UBFX. These opcodes generate scalar or vector ALU bitfield extract instructions for AMDGPU. The instructions allow both constant or register values for the offset and width operands. The 32-bit scalar version is expanded to a sequence that combines the offset and width into a single register. There are no 64-bit vgpr bitfield extract instructions, so the operations are expanded to a sequence of instructions that implement the operation. If the width is a constant, then the 32-bit bitfield extract instructions are used. Moved the AArch64 specific code for creating G_SBFX to CombinerHelper.cpp so that it can be used by other targets. Only bitfield extracts with constant offset and width values are handled currently. Differential Revision: https://reviews.llvm.org/D100149	2021-06-28 09:06:44 -04:00
David Blaikie	1b112c80a6	PR37255: DebugInfo: LTO with -g inlined into -gmlt combined with Split DWARF without CU cross-references A combination of features ^ that lead to a mismatch of expectations about how a subprogram definition DIE would be produced with/without a declaration when taking full -g debug info and inlining it into a -gmlt CU - specifically when using Split DWARF that doesn't support cross-CU references, so we have to put the -g debug info into the -gmlt CU, which gets confusing about which mode is respected. This patch comes down on respecting the CU the debug info is emitted into, rather than preserving the full debug info when it's emitted into the gmlt CU.	2021-06-27 14:40:38 -07:00
David Green	2887f14639	[ISel] Port AArch64 SABD and UABD to DAGCombine This ports the AArch64 SABD and USBD over to DAG Combine, where they can be used by more backends (notably MVE in a follow-up patch). The matching code has changed very little, just to handle legal operations and types differently. It selects from (ABS (SUB (EXTEND a), (EXTEND b))), producing a ubds/abdu which is zexted to the original type. Differential Revision: https://reviews.llvm.org/D91937	2021-06-26 19:34:16 +01:00
David Green	b8c8bb0769	[DAG] Fold neg(splat(neg(x)) -> splat(x) This add as a fold of sub(0, splat(sub(0, x))) -> splat(x). This can come up in the lowering of right shifts under AArch64, where we generate a shift left of a negated number. Differential Revision: https://reviews.llvm.org/D103755	2021-06-25 19:53:29 +01:00
Hendrik Greving	e15e1417b9	[ModuloSchedule] Pass loop block explicitly to kernel rewriter. This change is NFC upstream. We pass in the loop's block to the kernel rewriter explicitly, instead of assuming it's the loop's top block. This change is made for downstream targets where this assumption doesn't hold. Differential Revision: https://reviews.llvm.org/D104811	2021-06-25 09:51:22 -07:00
Sander de Smalen	b732e6c9a8	Revert "[GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize." This patch seems to be causing build errors, reverting it for now. This reverts commit `aeab9d9570`.	2021-06-25 17:37:16 +01:00
Sander de Smalen	aeab9d9570	[GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize. To reflect that the size may be scalable, a TypeSize is returned instead of an unsigned. In places where the result is used, it currently relies on an implicit cast of TypeSize -> uint64_t, which asserts that the type is not scalable. This patch is NFC for fixed-width vectors. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104454	2021-06-25 17:06:50 +01:00
Sander de Smalen	c9acd2f32e	[GlobalISel] NFC: Change LLT::changeNumElements to LLT::changeElementCount. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104453	2021-06-25 15:54:00 +01:00
Sander de Smalen	968980ef08	[GlobalISel] NFC: Change LLT::scalarOrVector to take ElementCount. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104452	2021-06-25 11:26:16 +01:00
Martin Storsjö	42f74e8249	[llvm] Rename StringRef _lower() method calls to _insensitive() This is a mechanical change. This actually also renames the similarly named methods in the SmallString class, however these methods don't seem to be used outside of the llvm subproject, so this doesn't break building of the rest of the monorepo.	2021-06-25 00:22:01 +03:00
Craig Topper	03f9e04bc3	[TargetLowering][ARM] Don't alter opaque constants in TargetLowering::ShrinkDemandedConstant. We don't constant fold based on demanded bits elsewhere in SimplifyDemandedBits, so I don't think we should shrink them either. The affected ARM test changes because a constant become non-opaque and eventually enabled some constant folding. This no longer happens. I checked and InstCombine is able to simplify this test. I'm not sure exactly what it was trying to test. Reviewed By: lebedev.ri, dmgreen Differential Revision: https://reviews.llvm.org/D104832	2021-06-24 10:09:36 -07:00
Sander de Smalen	d5e14ba88c	[GlobalISel] NFC: Change LLT::vector to take ElementCount. This also adds new interfaces for the fixed- and scalable case: * LLT::fixed_vector * LLT::scalable_vector The strategy for migrating to the new interfaces was as follows: * If the new LLT is a (modified) clone of another LLT, taking the same number of elements, then use LLT::vector(OtherTy.getElementCount()) or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2) or operator. That is because there is no reason to specifically restrict the types to 'fixed_vector'. If the algorithm works on the number of elements (as unsigned), then just use fixed_vector. This will need to be fixed up in the future when modifying the algorithm to also work for scalable vectors, and will need then need additional tests to confirm the behaviour works the same for scalable vectors. * If the test used the '/Scalable=/true` flag of LLT::vector, then this is replaced by LLT::scalable_vector. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104451	2021-06-24 11:26:12 +01:00
Stephen Tozer	c72705678c	Partial Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This is a partial reapply of the original commit and the followup commit that were previously reverted; this reapply also includes a small fix for a potential source of non-determinism, but also has a small change to turn off variadic debug value salvaging, to ensure that any future revert/reapply steps to disable and renable this feature do not risk causing conflicts. Differential Revision: https://reviews.llvm.org/D91722 This reverts commit `386b66b2fc`.	2021-06-24 09:46:38 +01:00
Carl Ritson	6b0f98d442	[ValueTypes] Define MVTs for v3i64/v3f64 to complement v6i32/v6f32 Having type symmetry with these is somewhat necessary when implementing support for 192-bit values. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104621	2021-06-24 12:41:22 +09:00
modimo	a7b62699c8	[NFC] [DwarfEHPrepare] Add additional stats for EH Stats added: 1. NumCleanupLandingPadsUnreachable: how many cleanup landing pads were optimized as unreachable 1. NumCleanupLandingPadsRemaining: how many cleanup landing pads remain 1. NumNoUnwind: Number of functions with nounwind attribute 1. NumUnwind: Number of functions with unwind attribute DwarfEHPrepare is always run a single time as part of `TargetPassConfig::addISelPasses()` which makes it an ideal place near the end of the pipeline to record this information. Example output from clang built with exceptions cumulative during thinLTO backend (NumCleanupLandingPadsUnreachable was not incremented): "dwarfehprepare.NumCleanupLandingPadsRemaining": 123660, "dwarfehprepare.NumNoUnwind": 323836, "dwarfehprepare.NumUnwind": 472893, Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104161	2021-06-23 17:09:30 -07:00
Craig Topper	91319534ba	[CGP][RISCV] Teach CodeGenPrepare::optimizeSwitchInst to honor isSExtCheaperThanZExt. This optimization pre-promotes the input and constants for a switch instruction to a legal type so that all the generated compares share the same extend. Since RISCV prefers sext for i32 to i64 extends, we should honor that to use sext.w instead of a pair of shifts. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D104612	2021-06-23 15:38:11 -07:00
Xun Li	f09ec01f1f	[SjLj] Insert UnregisterFn before musttail call When inserting UnregisterFn, if there is a musttail call, we must insert before the call so that we don't break the musttail call contract. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104807	2021-06-23 15:33:55 -07:00
Xun Li	f8c84da23b	Revert "[SjLj] Insert UnregisterFn before musttail call" This reverts commit `f36703ada3`. Test failure: https://lab.llvm.org/buildbot#builders/104/builds/3450	2021-06-23 15:31:35 -07:00
Xun Li	f36703ada3	[SjLj] Insert UnregisterFn before musttail call When inserting UnregisterFn, if there is a musttail call, we must insert before the call so that we don't break the musttail call contract. Differential Revision: https://reviews.llvm.org/D104807	2021-06-23 14:29:46 -07:00
Jinsong Ji	c125af82a5	[DAGCombine] Check reassoc flags in aggressive fsub fusion The is from discussion in https://reviews.llvm.org/D104247#inline-993387 The contract and reassoc flags shouldn't imply each other . All the aggressive fsub fusion reassociate operations, we should guard them with reassoc flag check. Reviewed By: mcberg2017 Differential Revision: https://reviews.llvm.org/D104723	2021-06-23 13:59:40 +00:00
Jon Roelofs	493d6928fe	[Remarks] Make memsize remarks report as an analysis, not a missed opportunity. Differential revision: https://reviews.llvm.org/D104078	2021-06-22 18:22:47 -07:00
zhijian	bd240b3d77	[AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table. Summary: generate eh_info when vector registers are saved according to the traceback table. struct eh_info_t { unsigned version; /* EH info version 0 / #if defined(64BIT) char _pad[4]; / padding / #endif unsigned long lsda; / Pointer to Language Specific Data Area / unsigned long personality; / Pointer to the personality routine */ }; the value of lsda and personality is zero when the number of vector registers saved is large zero and there is not personality of the function Reviewers: Jason Liu Differential Revision: https://reviews.llvm.org/D103651	2021-06-22 13:01:31 -04:00
Fangrui Song	f53d791520	Improve the diagnostic of DiagnosticInfoResourceLimit (and warn-stack-size in particular) Before: `warning: stack size limit exceeded (888) in main` After: `warning: stack frame size (888) exceeds limit (100) in function 'main'` (the -Wframe-larger-than limit will be mentioned) Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D104667	2021-06-22 09:55:20 -07:00
Sander de Smalen	bd7f7e2eba	[GlobalISel] Add scalable property to LLT types. This patch aims to add the scalable property to LLT. The rest of the patch-series changes the interfaces to take/return ElementCount and TypeSize, which both have the ability to represent the scalable property. The changes are mostly mechanical and aim to be non-functional changes for fixed-width vectors. For scalable vectors some unit tests have been added, but no effort has been put into making any of the GlobalISel algorithms work with scalable vectors yet. That will be left as future work. The work is split into a series of 5 patches to make reviews easier. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D104450	2021-06-22 08:43:34 +01:00
Eli Friedman	74909e4b6e	Rename MachineMemOperand::getOrdering -> getSuccessOrdering. Since this method can apply to cmpxchg operations, make sure it's clear what value we're actually retrieving. This will help ensure we don't accidentally ignore the failure ordering of cmpxchg in the future. We could potentially introduce a getOrdering() method on AtomicSDNode that asserts the operation isn't cmpxchg, but not sure that's worthwhile. Differential Revision: https://reviews.llvm.org/D103338	2021-06-21 16:49:27 -07:00
Nick Desaulniers	8ace121305	[IR] convert warn-stack-size from module flag to fn attr Otherwise, this causes issues when building with LTO for object files that use different values. Link: https://github.com/ClangBuiltLinux/linux/issues/1395 Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D104342	2021-06-21 15:09:25 -07:00
Jinsong Ji	3996311ee1	[DAGCombine] reassoc flag shouldn't enable contract According to IR LangRef, the FMF flag: contract Allow floating-point contraction (e.g. fusing a multiply followed by an addition into a fused multiply-and-add). reassoc Allow reassociation transformations for floating-point instructions. This may dramatically change results in floating-point. My understanding is that these two flags shouldn't imply each other, as we might have a SDNode that can be reassociated with others, but not contractble. eg: We may want following fmul/fad/fsub to freely reassoc, but don't want fma being generated here. %F = fmul reassoc double %A, %B ; <double> [#uses=1] %G = fmul reassoc double %C, %D ; <double> [#uses=1] %H = fadd reassoc double %F, %G ; <double> [#uses=1] %I = fsub reassoc double %H, %E ; <double> [#uses=1] Before https://reviews.llvm.org/D45710, `reassoc` flag actually did not imply isContratable either. The current implementation also only check the flag in fadd node, ignoring fmul node, this patch update that as well. Reviewed By: spatel, qiucf Differential Revision: https://reviews.llvm.org/D104247	2021-06-21 21:15:43 +00:00
Hendrik Greving	96994427f2	RegisterCoalescer: Fix iterating through use operands. Fixes a minor bug when trying to iterate through use operands when updating debug use operands. Extends a test to include above. Differential Revision: https://reviews.llvm.org/D104576	2021-06-21 09:17:54 -07:00
Craig Topper	3a8c7060cc	[TypePromotion] Prune Intrinsic includes. NFC TypePromotion is meant to be a generic pass and doesn't reference any ARM intrinsics so it shouldn't include IntrinsicsARM.h. The other Intrinsic related headers appear to be unneeded as well.	2021-06-20 13:04:02 -07:00
Fangrui Song	59d90fe817	Simplify some typedef struct	2021-06-19 11:36:44 -07:00
Michael Liao	b9c05aff20	[MIRPrinter] Add machine metadata support. - Distinct metadata needs generating in the codegen to attach correct AAInfo on the loads/stores after lowering, merging, and other relevant transformations. - This patch adds 'MachhineModuleSlotTracker' to help assign slot numbers to these newly generated unnamed metadata nodes. - To help 'MachhineModuleSlotTracker' track machine metadata, the original 'SlotTracker' is rebased from 'AbstractSlotTrackerStorage', which provides basic interfaces to create/retrive metadata slots. In addition, once LLVM IR is processsed, additional hooks are also introduced to help collect machine metadata and assign them slot numbers. - Finally, if there is any such machine metadata, 'MIRPrinter' outputs an additional 'machineMetadataNodes' field containing all the definition of those nodes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103205	2021-06-19 12:48:08 -04:00
Hongtao Yu	bd52495518	[CSSPGO] Undoing the concept of dangling pseudo probe As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen. I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104477	2021-06-18 15:14:11 -07:00
Simon Pilgrim	7353beda4a	[DAG] SelectionDAG::computeKnownBits - use APInt::insertBits to merge subvector knownbits. NFCI. As noticed on D104472 we can use APInt::insertBits which will avoid a lot of temporary APInt creations	2021-06-18 14:59:01 +01:00
Jon Roelofs	a2ab765029	[GISel] Eliminate redundant bitmasking This was a GISel vs SDAG regression that showed up at -Os on arm64 in: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.test https://llvm.godbolt.org/z/aecjodsjG Differential revision: https://reviews.llvm.org/D103334	2021-06-17 12:53:00 -07:00
David Green	fda8b4714e	[InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass The Interleave Access pass will convert shuffle(binop(load, load)) to binop(shuffle(load), shuffle(load)), in order to create more interleaving load patterns (VLD2/3/4) that might have been messed up by instcombine. As shown in D104247 we were missing copying IR flags to the new instruction though, which should just be kept the same as the original instruction. Differential Revision: https://reviews.llvm.org/D104255	2021-06-17 09:53:33 +01:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Sushma Unnibhavi	2193347e72	[M68k][GloballSel] Adding initial GlobalISel infrastructure Wiring up GlobalISel for the M68k backend Differential Revision: https://reviews.llvm.org/D101819	2021-06-16 10:48:38 -06:00
David Spickett	e4ecd83fe9	[llvm][AArch64] Handle arrays of struct properly (from IR) This only applies to FastIsel. GlobalIsel seems to sidestep the issue. This fixes https://bugs.llvm.org/show_bug.cgi?id=46996 One of the things we do in llvm is decide if a type needs consecutive registers. Previously, we just checked if it was an array or not. (plus an SVE specific check that is not changing here) This causes some confusion when you arbitrary IR like: ``` %T1 = type { double, i1 }; define [ 1 x %T1 ] @foo() { entry: ret [ 1 x %T1 ] zeroinitializer } ``` We see it is an array so we call CC_AArch64_Custom_Block which bails out when it sees the i1, a type we don't want to put into a block. This leaves the location of the double in some kind of intermediate state and leads to odd codegen. Which then crashes the backend because it doesn't know how to implement what it's been asked for. You get this: ``` renamable $d0 = FMOVD0 $w0 = COPY killed renamable $d0 ``` Rather than this: ``` $d0 = FMOVD0 $w0 = COPY $wzr ``` The backend knows how to copy 64 bit to 64 bit registers, but not 64 to 32. It can certainly be taught how but the real issue seems to be us even trying to assign a register block in the first place. This change makes the logic of AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters a bit more in depth. If we find an array, also check that all the nested aggregates in that array have a single member type. Then CC_AArch64_Custom_Block's assumption of a type that looks like [ N x type ] will be valid and we get the expected codegen. New tests have been added to exercise these situations. Note that some of the output is not ABI compliant. The aim of this change is to simply handle these situations and not to make our processing of arbitrary IR ABI compliant. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104123	2021-06-16 13:56:01 +00:00
Dylan Fleming	dab05335a6	[SVE] Fix PromoteIntRes_TRUNCATE not to call getVectorNumElements Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D104115	2021-06-16 13:09:43 +01:00
Rong Xu	82a0bb1afc	[SampleFDO] Place the discriminator flag variable into the used list. We create flag variable "__llvm_fs_discriminator__" in the binary to indicate that FSAFDO hierarchical discriminators are used. This variable might be GC'ed by the linker since it is not explicitly reference. I initially added the var to the use list in pass MIRFSDiscriminator but it did not work. It turned out the used global list is collected in lowering (before MIR pass) and then emitted in the end of pass pipeline. Here I add the variable to the use list in IR level's AddDiscriminators pass. The machine level code is still keep in the case IR's AddDiscriminators is not invoked. If this is the case, this just use -Wl,--export-dynamic-symbol=__llvm_fs_discriminator__ to force the emit. Differential Revision: https://reviews.llvm.org/D103988	2021-06-15 21:51:04 -07:00
Rong Xu	95f9026c17	Revert "[SampleFDO] Using common linkage for the discriminator flag variable" This reverts commit `434fed5aff`. Post commit review suggested to use another implmenentation. Detailed can be found in the review.	2021-06-15 21:22:23 -07:00
Rong Xu	434fed5aff	[SampleFDO] Using common linkage for the discriminator flag variable We create flag variable "__llvm_fs_discriminator__" in the binary to indicate that FSAFDO hierarchical discriminators are used. This variable might be GC'ed by the linker since it is not explicitly reference. I initially added the var to the use list in pass MIRFSDiscriminator but it did not work. It turned out the used global list is collected in lowering (before MIR pass) and then emitted in the end of pass pipeline. In this patch, we use a "common" linkage for this variable so that it will be GC'ed by the linker. Differential Revision: https://reviews.llvm.org/D103988	2021-06-15 14:51:27 -07:00
Roman Lebedev	585e65d330	[TLI] SimplifyDemandedVectorElts(): handle SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(?, 0)) Iff we have `SCALAR_TO_VECTOR` (and we demand it's only defined 0'th element), and said scalar was produced by `EXTRACT_VECTOR_ELT` from the 0'th element of some vector, then we can just continue traversal into said source vector. This comes up in X86 vector uniform shift lowering. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104250	2021-06-14 23:52:53 +03:00
Saleem Abdulrasool	5b5833b9e0	SelectionDAG: repair the Windows build `6e5628354e` regressed the Windows build as the return type no longer matched in both branches for the return value type deduction. This uses a bit more compiler magic to deal with that.	2021-06-14 08:25:36 -07:00
zhijian	7ed515d168	[AIX][XCOFF] emit vector info of traceback table. Summary: emit vector info of traceback table. Reviewers: Jason Liu,Hubert Tong Differential Revision: https://reviews.llvm.org/D93659	2021-06-14 11:15:22 -04:00
Roman Lebedev	0f94c3c80d	[NFC][DAGCombine] Extract getFirstIndexOf() lambda back into a function Not all supported compilers like such lambdas, at least one buildbot is unhappy.	2021-06-14 16:25:59 +03:00
Roman Lebedev	6e5628354e	[DAGCombine] reduceBuildVecToShuffle(): sort input vectors by decreasing size The sorting, obviously, must be stable, else we will have random assembly fluctuations. Apparently there was no test coverage that would benefit from that, so i've added one test. The sorting consists of two parts - just sort the input vectors, and recompute the shuffle mask -> input vector mapping. I don't believe we need to do anything else. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104187	2021-06-14 16:18:37 +03:00
Jeroen Dobbelaere	bb8ce25e88	Intrinsic::getName: require a Module argument Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary. This fixes the issue that was detected by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288 (as mentioned by @fhahn), after committing D91250. Note that the `LLVMIntrinsicCopyOverloadedName` is being deprecated in favor of `LLVMIntrinsicCopyOverloadedName2`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99173	2021-06-14 14:52:29 +02:00
RamNalamothu	167e7afcd5	Implement DW_CFA_LLVM_* for Heterogeneous Debugging Add support in MC/MIR for writing/parsing, and DebugInfo. This is part of the Extensions for Heterogeneous Debugging defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html Specifically the CFI instructions implemented here are defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#cfa-definition-instructions Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D76877	2021-06-14 08:51:50 +05:30
Simon Pilgrim	2c4ee1e112	RegUsageInfoPropagate.cpp - remove unused <string> and <map> includes. NFCI.	2021-06-13 15:19:24 +01:00
Florian Hahn	5cd66420cc	Revert "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB" This reverts commit `1b748faf2b` because it breaks building the llvm-test-suite with -verify-machineinstrs on X86: http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/9585/ Running llc -verify-machineinstr on X86 crashes on the IR below: target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" %struct.widget = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [16 x [16 x i16]], [6 x [32 x i32]], [16 x [16 x i32]], [4 x [12 x [4 x [4 x i32]]]], [16 x i32], i8, i32, i32*, i32, i32, i32, i32, i32, %struct.baz, %struct.wobble.1, i32, i32, i32, i32, i32, i32, %struct.quux.2, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x i32], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32**, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x [2 x i32]], [3 x [2 x i32]], i32, i32, i64, i64, %struct.zot.3, %struct.zot.3, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } %struct.baz = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, %struct.snork, %struct.wombat.0, %struct.wobble, i32, i32, i32, i32, i32, i32, i32, i32, i32 (%struct.widget, %struct.eggs), i32, i32, i32, i32 } %struct.snork = type { %struct.spam, %struct.zot, i32 (%struct.wombat, %struct.widget, %struct.snork) } %struct.spam = type { i32, i32, i32, i32, i8, i32 } %struct.zot = type { i32, i32, i32, i32, i32, i8, i32* } %struct.wombat = type { i32, i32, i32, i32, i32, i32, i32, i32, void (i32, i32, i32, i32), void (%struct.wombat, %struct.widget, %struct.zot)* } %struct.wombat.0 = type { [4 x [11 x %struct.quux]], [2 x [9 x %struct.quux]], [2 x [10 x %struct.quux]], [2 x [6 x %struct.quux]], [4 x %struct.quux], [4 x %struct.quux], [3 x %struct.quux] } %struct.quux = type { i16, i8 } %struct.wobble = type { [2 x %struct.quux], [4 x %struct.quux], [3 x [4 x %struct.quux]], [10 x [4 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]] } %struct.eggs = type { [1000 x i8], [1000 x i8], [1000 x i8], i32, i32, i32, i32, i32, i32, i32, i32 } %struct.wobble.1 = type { i32, [2 x i32], i32, i32, %struct.wobble.1, %struct.wobble.1, i32, [2 x [4 x [4 x [2 x i32]]]], i32, i64, i64, i32, i32, [4 x i8], [4 x i8], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } %struct.quux.2 = type { i32, i32, i32, i32, i32, %struct.quux.2* } %struct.zot.3 = type { i64, i16, i16, i16 } define void @blam(%struct.widget* %arg, i32 %arg1) local_unnamed_addr { bb: %tmp = load i32, i32* undef, align 4 %tmp2 = sdiv i32 %tmp, 6 %tmp3 = sdiv i32 undef, 6 %tmp4 = load i32, i32* undef, align 4 %tmp5 = icmp eq i32 %tmp4, 4 %tmp6 = select i1 %tmp5, i32 %tmp3, i32 %tmp2 %tmp7 = getelementptr inbounds [4 x [4 x i32]], [4 x [4 x i32]]* undef, i64 0, i64 0, i64 0 %tmp8 = zext i16 undef to i32 %tmp9 = zext i16 undef to i32 %tmp10 = load i16, i16* undef, align 2 %tmp11 = zext i16 %tmp10 to i32 %tmp12 = zext i16 undef to i32 %tmp13 = zext i16 undef to i32 %tmp14 = zext i16 undef to i32 %tmp15 = load i16, i16* undef, align 2 %tmp16 = zext i16 %tmp15 to i32 %tmp17 = zext i16 undef to i32 %tmp18 = sub nsw i32 %tmp8, %tmp9 %tmp19 = shl nsw i32 undef, 1 %tmp20 = add nsw i32 %tmp19, %tmp18 %tmp21 = sub nsw i32 %tmp11, %tmp12 %tmp22 = shl nsw i32 undef, 1 %tmp23 = add nsw i32 %tmp22, %tmp21 %tmp24 = sub nsw i32 %tmp13, %tmp14 %tmp25 = shl nsw i32 undef, 1 %tmp26 = add nsw i32 %tmp25, %tmp24 %tmp27 = sub nsw i32 %tmp16, %tmp17 %tmp28 = shl nsw i32 undef, 1 %tmp29 = add nsw i32 %tmp28, %tmp27 %tmp30 = sub nsw i32 %tmp20, %tmp29 %tmp31 = sub nsw i32 %tmp23, %tmp26 %tmp32 = shl nsw i32 %tmp30, 1 %tmp33 = add nsw i32 %tmp32, %tmp31 store i32 %tmp33, i32* undef, align 4 %tmp34 = mul nsw i32 %tmp31, -2 %tmp35 = add nsw i32 %tmp34, %tmp30 store i32 %tmp35, i32* undef, align 4 %tmp36 = select i1 %tmp5, i32 undef, i32 undef br label %bb37 bb37: ; preds = %bb %tmp38 = load i32, i32* undef, align 4 %tmp39 = ashr i32 %tmp38, %tmp6 %tmp40 = load i32, i32* undef, align 4 %tmp41 = sdiv i32 %tmp39, %tmp40 store i32 %tmp41, i32* undef, align 4 ret void }	2021-06-12 11:41:38 +01:00
Arthur Eubanks	c0c5a98b2c	[NFC][OpaquePtr] Explicitly pass GEP source type in optimizeGatherScatterInst() Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103480	2021-06-11 11:49:59 -07:00
Matt Arsenault	9d7299b6f0	GlobalISel: Reduce indentation and remove dead path	2021-06-11 13:45:24 -04:00
Matt Arsenault	93f3c7cc3e	CodeGen: Fix missing const	2021-06-11 13:45:24 -04:00
Tomas Matheson	773771ba38	[CodeGen][regalloc] Don't align stack slots if the stack can't be realigned Register allocation may spill virtual registers to the stack, which can increase alignment requirements of the stack frame. If the the function did not require stack realignment before register allocation, the registers required to do so may not be reserved/available. This results in a stack frame that requires realignment but can not be realigned. Instead, only increase the alignment of the stack if we are still able to realign. The register SpillAlignment will be ignored if we can't realign, and the backend will be responsible for emitting the correct unaligned loads and stores. This seems to be the assumed behaviour already, e.g. ARMBaseInstrInfo::storeRegToStackSlot and X86InstrInfo::storeRegToStackSlot are both `canRealignStack` aware. Differential Revision: https://reviews.llvm.org/D103602	2021-06-11 16:49:12 +01:00
Simon Pilgrim	61cdaf66fe	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Carl Ritson	2c2d2922a2	[ValueTypes] Define MVTs for v6i32, v6f32, v7i32, v7f32 For use in AMDGPU selection DAG. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103881	2021-06-11 08:58:16 +09:00
Carl Ritson	cfbb92441f	[SDAG] Fix pow2 assumption when splitting vectors When reducing vector builds to shuffles it possible that the DAG combiner may try to extract invalid subvectors. This happens as the existing code assumes vectors will be power of 2 sizes, which is already untrue, but becomes more noticable with v6 and v7 types. Specifically the existing code assumes that half PowerOf2Ceil of a given vector index will fit twice into a given vector. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103880	2021-06-11 08:58:16 +09:00
Wolfgang Pieb	5a1589fc6d	[static initializers] Emit global_ctors and global_dtors in reverse order when .ctors/.dtors are used. Reviewed By: rnk, MaskRay, efriedma Differential Revision: https://reviews.llvm.org/D103495	2021-06-10 16:44:47 -07:00
Nick Desaulniers	fc018ebb60	[IR] make -warn-frame-size into a module attr -Wframe-larger-than= is an interesting warning; we can't know the frame size until PrologueEpilogueInsertion (PEI); very late in the compilation pipeline. -Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO and needed to be re-specified via -plugin-opt. Instead, make it part of the IR proper as a module level attribute, similar to D103048. Introduce -fwarn-stack-size CC1 option. Reviewed By: rsmith, qcolombet Differential Revision: https://reviews.llvm.org/D103928	2021-06-10 16:15:27 -07:00
David Spickett	64de8763aa	Revert "Implementation of global.get/set for reftypes in LLVM IR" This reverts commit `31859f896c`. Causing SVE and RISCV-V test failures on bots.	2021-06-10 10:11:17 +00:00
Paulo Matos	31859f896c	Implementation of global.get/set for reftypes in LLVM IR This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and lowering methods for load and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D95425	2021-06-10 10:07:45 +02:00
Jinsong Ji	4a89ed373c	[AIX] Add traceback ssp canary bit support We will need to set the ssp canary bit in traceback table to communicate with unwinder about the canary. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D103202	2021-06-10 02:40:02 +00:00
Sanjay Patel	dd763ac791	[SDAG] fix miscompile from merging stores of different sizes As shown in: https://llvm.org/PR50623 ...and the similar tests here, we were not accounting for store merging of different sizes that do not cover the entire range of the wide value to be stored. This is the easy fix: just make sure that all of the original stores are the same size, so when we calculate the wide width, it's a simple N * M check. This still allows all of the motivating optimizations from: D86420 / `54a5dd485c` D87112 / `7a06b166b1` We could enhance this code to track individual bytes and allow merging multiple sizes.	2021-06-09 09:51:39 -04:00
Fraser Cormack	502edebd9d	[ValueTypes][RISCV] Cap RVV fixed-length vectors by size This patch changes RVV's policy for its supported list of fixed-length vector types by capping by vector size rather than element count. Now all 1024-byte vectors (of supported element types) are supported, rather than all 256-element vectors. This is a more natural fit for the architecture, and allows us to, for example, improve the support for vector bitcasts. This change necessitated the adding of some new simple types to avoid "regressing" on the number of currently-supported vectors. We round out the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and `v512f16`. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103884	2021-06-09 12:15:37 +01:00
Matt Arsenault	31a9659de5	GlobalISel: Avoid use of G_INSERT in insertParts G_INSERT legalization is incomplete and doesn't work very well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES padding with undef values (although this can get pretty large). For the case of load/store narrowing, this is still performing the load/stores in irregularly sized pieces. It might be cleaner to split this down into equal sized pieces, and rely on load/store merging to optimize it.	2021-06-08 14:44:24 -04:00
Matt Arsenault	2927d40f04	GlobalISel: Hide virtual register creation in MIRBuilder	2021-06-08 14:44:24 -04:00
Nick Desaulniers	3787ee4571	reland [IR] make -stack-alignment= into a module attr Relands commit `433c8d950c` with fixes for MIPS. Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO. Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079 Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103048	2021-06-08 10:59:46 -07:00
Justin Bogner	4271e1d2c5	[GlobalISel] Handle non-multiples of the base type in narrowScalarAddSub When narrowing G_ADD and G_SUB, handle types that aren't a multiple of the type we're narrowing to. This allows us to handle types like s96 on 64 bit targets. Note that the test here has a couple of dead instructions because of the way the setup legalizes. I wasn't able to come up with a way to write this test that avoids that easily. Differential Revision: https://reviews.llvm.org/D97811	2021-06-08 10:13:38 -07:00
Justin Bogner	2a7e759734	[GlobalISel] Handle non-multiples of the base type in narrowScalarInsert When narrowing G_INSERT, handle types that aren't a multiple of the type we're narrowing to. This comes up if we're narrowing something like an s96 to fit in 64 bit registers and also for non-byte multiple packed types if they come up. This implementation handles these cases by extending the extra bits to the narrow size and truncating the result back to the destination size. Differential Revision: https://reviews.llvm.org/D97791	2021-06-08 10:13:38 -07:00
Simon Pilgrim	114e712c34	InstrEmitter.cpp - don't dereference a dyn_cast<>. dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.	2021-06-08 17:59:04 +01:00
Nick Desaulniers	a596b54d47	Revert "[IR] make -stack-alignment= into a module attr" This reverts commit `433c8d950c`. Breaks the MIPS build.	2021-06-08 08:55:50 -07:00
Nick Desaulniers	433c8d950c	[IR] make -stack-alignment= into a module attr Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO. Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079 Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103048	2021-06-08 08:31:04 -07:00
Simon Pilgrim	61a2d6bfe4	[DAG] foldShuffleOfConcatUndefs - ensure shuffles of upper (undef) subvector elements is undef (PR50609) shuffle(concat(x,undef),concat(y,undef)) -> concat(shuffle(x,y),shuffle(x,y)) If the original shuffle references any of the upper (undef) subvector elements, ensure the split shuffle masks uses undef instead of an out-of-bounds value. Fixes PR50609	2021-06-08 15:49:41 +01:00
Hans Wennborg	386b66b2fc	Revert "3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" > This reapplies `c0f3dfb9`, which was reverted following the discovery of > crashes on linux kernel and chromium builds - these issues have since > been fixed, allowing this patch to re-land. This reverts commit `36ec97f76a`. The change caused non-determinism in the compiler, see comments on the code review at https://reviews.llvm.org/D91722. Reverting to unbreak people's builds until that can be addressed. This also reverts the follow-up "[DebugInfo] Limit the number of values that may be referenced by a dbg.value" in `a0bd6105d8`.	2021-06-08 14:54:08 +02:00
Kerry McLaughlin	5db52751a5	[CostModel] Return an invalid cost for memory ops with unsupported types Fixes getTypeConversion to return `TypeScalarizeScalableVector` when a scalable vector type cannot be legalized by widening/splitting. When this is the method of legalization found, getTypeLegalizationCost will return an Invalid cost. The getMemoryOpCost, getMaskedMemoryOpCost & getGatherScatterOpCost functions already call getTypeLegalizationCost and will now also return an Invalid cost for unsupported types. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D102515	2021-06-08 12:07:36 +01:00
David Green	b889c6ee99	[DAG] Allow isNullOrNullSplat to see truncated zeroes This sets the AllowTruncation flag on isConstOrConstSplat in isNullOrNullSplat, allowing it to see truncated constant zeroes on architectures such as AArch64, where only a i32.i64 are legal. As a truncation of 0 is always 0, this should always be valid, allowing some extra folding to happen including some of the cases from D103755. Differential Revision: https://reviews.llvm.org/D103756	2021-06-08 10:18:58 +01:00
Arthur Eubanks	47211fa889	Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry" Needs to be discussed more. This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46 This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1 This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23 This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd	2021-06-07 16:07:44 -07:00
Matt Arsenault	dc98adfb44	GlobalISel: Use MMO helper for getting the size in bits	2021-06-07 14:26:48 -04:00
Matt Arsenault	f6555b917b	GlobalISel: Remove unnecessary .getReg(0)s	2021-06-07 14:26:48 -04:00
Guillaume Chatelet	1da2c7d25c	[NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE Differential Revision: https://reviews.llvm.org/D103251	2021-06-07 10:04:16 +00:00
Nikita Popov	1ffa6499ea	[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759	2021-06-06 16:29:50 +02:00
Nikita Popov	506875c879	[TargetLowering] Move methods out of line (NFC) Move methods using IRBuilder out of line, so we can drop the dependency on the header.	2021-06-06 16:02:10 +02:00
Nikita Popov	9914200393	[CodeGen] Add missing includes (NFC) These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.	2021-06-06 15:48:27 +02:00
Mirko Brkusanin	35ef4c940b	[AMDGPU][GlobalISel] Legalize G_ABS Legalize and select G_ABS so that we can use llvm.abs intrinsic Differential Revision: https://reviews.llvm.org/D102391	2021-06-04 14:46:43 +02:00
Jeremy Morse	4501928eb2	Re-land `ae4303b42c`, "Track PHI values through register coalescing" Was reverted in `0507fc2ffc`, in phi-coalesce-subreg.mir I'd explicitly named some passes to run instead of specifying a range. As a result some two-address-instrs weren't correctly rewritten and the verifier got upset. Original commit message: [DebugInstrRef][2/3] Track PHI values through register coalescing In the instruction referencing variable location model, we store variable locations that point at PHIs in MachineFunction during register allocation. Unfortunately, register coalescing can substantially change the locations of registers, and so that PHI-variable-location side table needs maintenence during the pass. This patch builds an index from the side table, and whenever a vreg gets coalesced into another vreg, update the index to record the new vreg that the PHI happens in. It also accepts a limited range of subregister coalescing, for example merging a subregister into a larger class. Differential Revision: https://reviews.llvm.org/D86813	2021-06-04 11:32:02 +01:00
Fraser Cormack	aec9cbbeb8	[SelectionDAG] Extend FoldConstantVectorArithmetic to SPLAT_VECTOR This patch extends the SelectionDAG's ability to constant-fold vector arithmetic to include support for SPLAT_VECTOR. This is not only for scalable-vector types but also for fixed-length vector types, which helps Hexagon in a couple of cases. The original RISC-V test case was in fact an infinite DAGCombine loop. The pattern `and (truncate v1), (truncate v2)` can be combined to `truncate (and v1, v2)` but the truncate can similarly be combined back to `truncate (and v1, v2)` (but, crucially, only when one of `v1` or `v2` is a constant vector). It wasn't exposed in on fixed-length types because a TRUNCATE of a constant BUILD_VECTOR was folded into the BUILD_VECTOR itself, whereas this did not happen for the equivalent (scalable-vector) SPLAT_VECTOR. Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D103246	2021-06-04 09:53:15 +01:00
Esme-Yi	fbfd717197	[Debug-Info] handle DW_CC_pass_by_value/DW_CC_pass_by_reference under strict DWARF. Summary: When -strict-dwarf=true is specified, the calling convention info DW_CC_pass_by_value or DW_CC_pass_by_reference can only be generated at DWARF5. Reviewed By: shchenz, dblaikie Differential Revision: https://reviews.llvm.org/D103300	2021-06-04 08:14:47 +00:00
Arthur Eubanks	9255a5c1ba	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. Issues can be diagnosed with D103412. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-06-03 15:52:01 -07:00
Brendon Cahoon	53ab2d821e	[GlobalISel] Add G_SBFX/G_UBFX to computeKnownBits Differential Revision: https://reviews.llvm.org/D102969	2021-06-03 16:01:47 -04:00
Eli Friedman	44cdf771fe	[AtomicExpand] Merge cmpxchg success and failure ordering when appropriate. If we're not emitting separate fences for the success/failure cases, we need to pass the merged ordering to the target so it can emit the correct instructions. For the PowerPC testcase, we end up with extra fences, but that seems like an improvement over missing fences. If someone wants to improve that, the PowerPC backed could be taught to emit the fences after isel, instead of depending on fences emitted by AtomicExpand. Fixes https://bugs.llvm.org/show_bug.cgi?id=33332 . Differential Revision: https://reviews.llvm.org/D103342	2021-06-03 11:34:35 -07:00
Nikita Popov	983565a6fe	[ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) This is a followup to D103422. The DenseMapInfo implementations for ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h headers, which means that these two headers no longer need to be included by DenseMapInfo.h. This required adding a few additional includes, as many files were relying on various things pulled in by ArrayRef.h. Differential Revision: https://reviews.llvm.org/D103491	2021-06-03 18:34:36 +02:00
Jeremy Morse	0507fc2ffc	Revert "[DebugInstrRef][2/3] Track PHI values through register coalescing" This reverts commit `ae4303b42c`. Expensive checks buildbot has found a problem with this: https://lab.llvm.org/buildbot/#/builders/16/builds/11863	2021-06-03 17:16:58 +01:00
Jeremy Morse	ae4303b42c	[DebugInstrRef][2/3] Track PHI values through register coalescing In the instruction referencing variable location model, we store variable locations that point at PHIs in MachineFunction during register allocation. Unfortunately, register coalescing can substantially change the locations of registers, and so that PHI-variable-location side table needs maintenence during the pass. This patch builds an index from the side table, and whenever a vreg gets coalesced into another vreg, update the index to record the new vreg that the PHI happens in. It also accepts a limited range of subregister coalescing, for example merging a subregister into a larger class. Differential Revision: https://reviews.llvm.org/D86813	2021-06-03 17:06:51 +01:00
Fraser Cormack	1de1887f5f	[CodeGen] Fix a scalable-vector crash in VSELECT legalization The `DAGTypeLegalizer::WidenVSELECTMask` function is not (yet) ready for scalable vector types, and has numerous places in which it tries to grab either the fixed size or number of elements of its types. I believe that it should be possible to update this method to properly account for scalable-vector types, but we don't have test cases for that; RISC-V bails out early on as it has legal i1 vector masks. As such, this patch just prevents it from crashing. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103536	2021-06-03 10:24:55 +01:00
Fraser Cormack	2dd20a31f2	[ValueTypes] Fix scalable-vector changeExtendedVectorTypeToInteger The attached tests check for the regression in DAGCombiner's `visitVSELECT`, which may call this method. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103534	2021-06-03 09:36:56 +01:00
Sanjay Patel	0718ac706d	[SDAG] allow cast folding for vector sext-of-setcc with signed compare This extends `434c8e013a` and `ede3982792` to handle signed predicates by sign-extending the setcc operands. This is not shown directly in https://llvm.org/PR50055 , but the pattern is visible by changing the unsigned convert to signed in the source code.	2021-06-02 15:05:02 -04:00
Rong Xu	6745ffe4fa	[SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is mainly for ProfileData part of change. It will load FS Profile when such profile is detected. For an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. For other format profiles, the users need to use an internal option (-profile-isfs) to tell the compiler that the profile uses FS discriminators. This patch also simplified the bit API used by FS discriminators. Differential Revision: https://reviews.llvm.org/D103041	2021-06-02 10:32:52 -07:00
Sanjay Patel	ede3982792	[SDAG] allow more cast folding for vector sext-of-setcc This is a follow-up to D103280 that eases the use restrictions, so we can handle the motivating case from: https://llvm.org/PR50055 The loop code is adapted from similar use checks in ExtendUsesToFormExtLoad() and SliceUpLoad(). I did not see an easier way to filter out non-chain uses of load values. Differential Revision: https://reviews.llvm.org/D103462	2021-06-02 13:14:49 -04:00

... 2 3 4 5 6 ...

31069 Commits