llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	c662795b07	[AsmPrinter][ELF] Emit local alias for ExternalLinkage dso_local GlobalAlias	2020-02-12 17:08:22 -08:00
Guozhi Wei	369d086d78	[MBP] Partial tail duplication into hot predecessors Current tail duplication embedded in MBP duplicates a BB into all or none of its predecessors without too much cost analysis. So sometimes it is duplicated into cold predecessors, and in other cases it may miss the duplication into hot predecessors. This patch improves tail duplication in 3 aspects: A successor can be duplicated into part of its predecessors. A more fine-grained benefit analysis, combined with 1, now a successor is duplicated into hot predecessors only. If a successor can't be duplicated into one predecessor, it doesn't impact the duplication into other predecessors. Differential Revision: https://reviews.llvm.org/D73387	2020-02-12 15:22:33 -08:00
Jay Foad	32aac25637	[KnownBits] Introduce anyext instead of passing a flag into zext Summary: This was a very odd API, where you had to pass a flag into a zext function to say whether the extended bits really were zero or not. All callers passed in a literal true or false. I think it's much clearer to make the function name reflect the operation being performed on the value we're tracking (rather than on the KnownBits Zero and One fields), so zext means the value is being zero extended and new function anyext means the value is being extended with unknown bits. NFC. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74482	2020-02-12 19:06:53 +00:00
Simon Pilgrim	9eb426c88c	[TargetLowering] Add NegatibleCost enum for isNegatibleForFree return codes The isNegatibleForFree/getNegatedExpression methods currently rely on a raw char value to indicate whether a negation is beneficial or not. This patch replaces the char return value with an NegatibleCost enum to more clearly demonstrate what is implied. It also renames isNegatibleForFree to getNegatibleCost to more accurately reflect whats going on. Differential Revision: https://reviews.llvm.org/D74221	2020-02-12 11:51:42 +00:00
Djordje Todorovic	97ed706a96	Revert "[DebugInfo] Enable the debug entry values feature by default" This reverts commit rG9f6ff07f8a39. Found a test failure on clang-with-thin-lto-ubuntu buildbot.	2020-02-12 11:59:04 +01:00
Clement Courbet	15488ff24b	[CodeGen] Fix the computation of the alignment of split stores. Summary: Right now the alignment of the lower half of a store is computed as align/2, which fails for unaligned stores (align = 1), and is overly pessimitic for, e.g. a 8 byte store aligned to 4 bytes. Fixes PR44851 Fixes PR44877 Reviewers: gchatelet, spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74311	2020-02-12 10:37:30 +01:00
Djordje Todorovic	9f6ff07f8a	[DebugInfo] Enable the debug entry values feature by default This patch enables the debug entry values feature. - Remove the (CC1) experimental -femit-debug-entry-values option - Enable it for x86, arm and aarch64 targets - Resolve the test failures - Leave the llc experimental option for targets that do not support the CallSiteInfo yet Differential Revision: https://reviews.llvm.org/D73534	2020-02-12 10:25:14 +01:00
Nicolai Hähnle	07a5b849f7	SelectionDAG: Fix bug in ClusterNeighboringLoads Summary: The method attempts to find loads that can be legally clustered by looking for loads consuming the same chain glue token. However, the old code looks at _all_ users of values produced by the chain node -- including uses of the loaded/returned value of volatile loads or atomics. This could lead to circular dependencies which then failed during scheduling. With this change, we filter out users by getResNo, i.e. by which SDValue value they use, to ensure that we only look at users of the chain glue token. This appears to be a rather old bug, which is perhaps surprising. However, the test case is actually quite fragile (i.e., it is hidden by fairly small changes), and the test _must_ use volatile loads for the bug to manifest. Reviewers: arsenm, bogner, craig.topper, foad Subscribers: MatzeB, jvesely, wdng, hiraditya, javed.absar, jfb, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74253	2020-02-12 09:12:55 +01:00
Craig Topper	0daf9b8e41	[X86][LegalizeTypes] Add SoftPromoteHalf support STRICT_FP_EXTEND and STRICT_FP_ROUND This adds a strict version of FP16_TO_FP and FP_TO_FP16 and uses them to implement soft promotion for the half type. This is enough to provide basic support for __fp16 with strictfp. Add the necessary X86 support to use VCVTPS2PH/VCVTPH2PS when F16C is enabled.	2020-02-11 22:30:04 -08:00
lewis-revill	a6bd1256ce	[DebugInfo] Call site entries cannot be generated for FrameSetup calls Instructions marked as FrameSetup do not cause requestLabelAfterInsn to be called and so no such label is generated. Call instructions which require call site entries to be generated require this label to be present in order to calculate the return PC offset/address, but the check for whether the call instruction is marked as FrameSetup was not present. Therefore in the case where a call instruction is marked as FrameSetup, an assertion failure occurs if a call site entry is to be generated. This is the case with RISC-V's implementation of save/restore via library calls. Differential Revision: https://reviews.llvm.org/D71593	2020-02-11 21:23:18 +00:00
OCHyams	35e0ab647b	[DebugInfo][NFC] Fixup the UserValue methods to use FragmentInfo Fixup the UserValue methods to use FragmentInfo instead of DIExpression because the DIExpression is only ever used to get the to get the FragmentInfo. The DIExpression is meaningless in the UserValue class because each definition point added to a UserValue may have a unique DIExpression. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D74057	2020-02-11 10:20:24 +00:00
OCHyams	3aa33fde03	[DebugInfo][NFC] Rename the class DbgValueLocation to DbgVariableValue Rename the class DbgValueLocation to DbgVariableValue and instances from Loc to DbgValue. These names better express the new semantics introduced in D74053. The class previously represented a { Location } only. It now represents a { Location, DIExpression } pair which together describe a value. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D74055	2020-02-11 10:20:24 +00:00
OCHyams	1e40799324	[DebugInfo] Teach LDV how to handle identical variable fragments LiveDebugVariables uses interval maps to explicitly represent DBG_VALUE intervals. DBG_VALUEs are filtered into an interval map based on their { Variable, DIExpression }. The interval map will coalesce adjacent entries that use the same { Location }. Under this model, DBG_VALUEs which refer to the same bits of the same variable will be filtered into different interval maps if they have different DIExpressions which means the original intervals will not be properly preserved. This patch fixes the problem by using { Variable, Fragment } to filter the DBG_VALUEs into maps, and coalesces adjacent entries iff they have the same { Location, DIExpression } pair. The solution is not perfect because we see the similar issues appear when partially overlapping fragments are encountered, but is far simpler than a complete solution (i.e. D70121). Fixes: pr41992, pr43957 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D74053	2020-02-11 10:20:24 +00:00
Amara Emerson	067dd9c6b1	[GlobalISel][CallLowering] Use stripPointerCasts(). A downstream test exposed a simple logic bug with the manual pointer stripping code, fix that by just using stripPointerCasts() on the value. I don't think there's a way to expose this issue upstream.	2020-02-10 15:43:57 -08:00
Matt Arsenault	f270da6bfc	RegisterCoalescer: Add LaneMask to debug printing	2020-02-10 12:34:33 -08:00
diggerlin	aa86311e62	[AIX][XCOFF] Support Mergeable2ByteCString and Mergeable4ByteCString SUMMARY: The patch is enable to support Mergeable2ByteCString and Mergeable4ByteCString Reviewers: daltenty Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D74164	2020-02-10 14:45:54 -05:00
Sebastian Neubauer	7cddd15e56	[SelectionDAG] Optimize build_vector of truncates and shifts Add a simplification to fuse a manual vector extract with shifts and truncate into a bitcast. Unpacking and packing values into vectors is only optimized with extractelement instructions, not when manually unpacked using shifts and truncates. This patch simplifies shifts and truncates into a bitcast if possible. Simplify (build_vec (trunc $1) (trunc (srl $1 width)) (trunc (srl $1 (2 * width))) ...) to (bitcast $1) Differential Revision: https://reviews.llvm.org/D73892	2020-02-10 15:04:07 +01:00
Djordje Todorovic	3a4dc577c9	[CSInfo] Fix the assertions regarding updating the CSInfo The call site info was not updated correctly when deleting corresponding call instructions. Differential Revision: https://reviews.llvm.org/D73700	2020-02-10 10:55:06 +01:00
Djordje Todorovic	68908993eb	[CSInfo] Use isCandidateForCallSiteEntry() when updating the CSInfo Use the isCandidateForCallSiteEntry(). This should mostly be an NFC, but there are some parts ensuring the moveCallSiteInfo() and copyCallSiteInfo() operate with call site entry candidates (both Src and Dest should be the call site entry candidates). Differential Revision: https://reviews.llvm.org/D74122	2020-02-10 10:03:14 +01:00
Amara Emerson	21c9d9ad43	[GlobalISel][CallLowering] Tighten constantexpr check for callee. I'm not sure there's a test case for this, but it's better to be safe.	2020-02-09 22:59:48 -08:00
Matt Arsenault	312a9d1b83	GlobalISel: Fix narrowScalar for G_{CTLZ\|CTTZ}_ZERO_UNDEF Narrow these for 64-bit VALU for AMDGPU.	2020-02-09 19:02:38 -05:00
Matt Arsenault	6135f5eda4	GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ The result type is separate from the source type.	2020-02-09 18:11:43 -05:00
Craig Topper	eeb63944e4	[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results. Remove code from LegalizeTypes that allowed this to work. We were already using BUILD_PAIR for this in some places so this standardizes on a single way to do this.	2020-02-08 09:52:31 -08:00
Craig Topper	2af1640f9a	[LegalizeDAG][X86][AMDGPU] Use ANY_EXTEND instead of ZERO_EXTEND when promoting ISD::CTTZ/CTTZ_ZERO_UNDEF. Summary: For CTTZ we place a set bit just past where the non-promoted type stopped so the extended bits won't be used for the count. For CTTZ_ZERO_UNDEF we don't care what happens if no bits are set in the original type and we end up counting into the extended bits. So we can just use ANY_EXTEND for both cases. This matches what is done in type legalization for these operations. We make no effort to force the upper bits to zero. Differential Revision: https://reviews.llvm.org/D74111	2020-02-07 22:25:56 -08:00
Amara Emerson	35c63d66aa	[GlobalISel][CallLowering] Look through bitcasts from constant function pointers. Calls to ObjC's objc_msgSend function are done by bitcasting the function global to the required function type signature. This patch looks through this bitcast so that we can do a direct call with bl on arm64 instead of using an indirect blr. Differential Revision: https://reviews.llvm.org/D74241	2020-02-07 15:32:54 -08:00
Vedant Kumar	0d0ef315cb	[MachineInstr] Add isCandidateForCallSiteEntry predicate Add the isCandidateForCallSiteEntry predicate to MachineInstr to determine whether a DWARF call site entry should be created for an instruction. For now, it's enough to have any call instruction that doesn't belong to a blacklisted set of opcodes. For these opcodes, a call site entry isn't meaningful. Differential Revision: https://reviews.llvm.org/D74159	2020-02-07 10:10:41 -08:00
Petar Avramovic	7df5fc9e03	[GlobalISel] Add buildMerge with SrcOp initializer list Allows more flexible use of buildMerge in places where use operands are available as SrcOp since it does not require explicit conversion to Register. Simplify code with new buildMerge. Differential Revision: https://reviews.llvm.org/D74223	2020-02-07 18:43:45 +01:00
Amara Emerson	28d22c2c9c	[GlobalISel][IRTranslator] Add special case support for ~memory inline asm clobber. This is a one off special case, since actually implementing full inline asm support will be much more involved. This lets us compile a lot more code as a common simple case. Differential Revision: https://reviews.llvm.org/D74201	2020-02-07 08:55:23 -08:00
Jinsong Ji	01edae1271	[AsmPrinter] Print FP constant in hexadecimal form instead Printing floating point number in decimal is inconvenient for humans. Verbose asm output will print out floating point values in comments, it helps. But in lots of cases, users still need additional work to covert the decimal back to hex or binary to check the bit patterns, especially when there are small precision difference. Hexadecimal form is one of the supported form in LLVM IR, and easier for debugging. This patch try to print all FP constant in hex form instead. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D73566	2020-02-07 16:00:55 +00:00
Matt Arsenault	3b198518ad	GlobalISel: Fix narrowing of G_CTPOP The result type is separate from the source type. Tests will be included in a future AMDGPU patch which uses this from RegBankSelect/applyMappingImpl.	2020-02-07 06:58:00 -08:00
Matt Arsenault	8de2dad9e0	GlobalISel: Fix lowering of G_CTLZ/G_CTTZ The type passed to lower was invalid, so I'm not sure how this was even working before. The source and destination type also do not have to match, so make sure to use the right ones.	2020-02-07 06:54:12 -08:00
Guillaume Chatelet	f85d3408e6	[NFC] Introduce an API for MemOp Summary: This patch introduces an API for MemOp in order to simplify and tighten the client code. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73964	2020-02-07 11:32:27 +01:00
Sourabh Singh Tomar	84e5760a16	[DebugInfo]: Reorderd the emission of debug_str section. Summary: This patch reorders the emission of debug_str section, so that string can come after macros. This is necessary for macro forms like DW_MACRO_define_strp, which emits macro as a string in debug_str section.	2020-02-07 11:15:55 +05:30
Amara Emerson	ac8a12c874	[GlobalISel] Use G_ZEXTLOAD instead of an anyextending load for non-pow-2 legalization. Fixes PR43288	2020-02-06 14:36:36 -08:00
Konstantin Schwarz	76986bdc46	[GlobalISel] Legalize more G_FP(EXT\|TRUNC) libcalls. This adds a new helper function for retrieving the floating point type corresponding to the specified bit-width.	2020-02-06 11:41:34 -08:00
Fangrui Song	727362e87b	[MC][ELF] Rename MC related "Associated" to "LinkedToSym" "linked-to section" is used by the ELF spec. By analogy, "linked-to symbol" is a good name for the signature symbol. The word "linked-to" implies a directed edge and makes it clear its relation with "sh_link", while one can argue that "associated" means an undirected edge. Also, combine tests and add precise SMLoc to improve diagnostics. Reviewed By: eugenis, grimar, jhenderson Differential Revision: https://reviews.llvm.org/D74082	2020-02-06 11:31:04 -08:00
Jeremy Morse	6531a78ac4	Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field" This reverts commit `ed29dbaafa`. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in `e39e2b4a79` that's technically unrelated. dbg-addr-dse.ll conflicted because `41206b61e3` (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ece761427f	Revert "[DebugInfo][DAG] Distinguish different kinds of location indirection" This reverts commit `3137fe4d23`. I'm backing out D68945, which this patch is a follow up for. It'll be re-landed when D68945 is fixed. The changes to dbg-value-func-arg.ll occur because our handling of certain kinds of location now mixes up indirection that happens at different points in a DIExpression. While this is a regression, it's a return to the prior behaviour while a better patch is sought.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ed5998d21e	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `2d3174c4df`. The overall solution for this problem is reverting D68945, which wasn't handling the -O0 path through the codegen backend correctly. See: discussion in D73526.	2020-02-06 14:41:39 +00:00
Sjoerd Meijer	93b0536fd2	[RDA] getInstFromId: find instructions. NFC. To find the instruction in the block for a given ID, first a count and then a lookup was performed in the map, which is almost the same thing, thus doing double the work. Differential Revision: https://reviews.llvm.org/D73866	2020-02-06 14:13:31 +00:00
Sam Parker	0a8cae10fe	[ReachingDefs] Make isSafeToMove more strict. Test that we're not moving the instruction through instructions with side-effects. Differential Revision: https://reviews.llvm.org/D74058	2020-02-06 14:06:08 +00:00
Diogo Sampaio	8ba2b62810	[ARM] Fix non-determenistic behaviour Summary: ARM Type Promotion pass does not clear the container that defines if one variable was visited or not, missing optimization opportunities by luck when two llvm:Values from different functions are allocated at the same memory address. Also fixes a comment and uses existing method to pop and obtain last element of the worklist. Reviewers: samparker Reviewed By: samparker Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73970	2020-02-06 09:21:13 +00:00
Matt Arsenault	7464e8d6ad	GlobalISel: Remove check for illegal MIR The verifier will catch this.	2020-02-05 18:37:17 -05:00
Jonas Paulsson	96ea377ea4	[PHIElimination] Compile time optimization for huge functions. This is a compile-time optimization for PHIElimination (splitting of critical edges), which was reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As discussed there, the way to remedy the slowdowns with huge functions is to pre-compute the live-in registers for each MBB in an efficient way in PHIElimination.cpp and then pass that information along to LiveVariabless::addNewBlock(). In all the huge test programs where this slowdown has been noticable, it has dissapeared entirely with this patch. Review: Björn Pettersson, Quentin Colombet. Differential Revision: https://reviews.llvm.org/D73152	2020-02-05 18:10:03 -05:00
Matt Arsenault	9087ef0765	GlobalISel: Allow CSE of G_IMPLICIT_DEF The legalizer produces a lot of these, and they make reading legalized MIR annoying. For some reason, this does seem to sometimes introduce copies of implicit def, which is dumb.	2020-02-05 17:47:21 -05:00
Shu-Chun Weng	ce9633633c	[GlobalISel][AArch64] Fix contract cross-bank copies with SIMD instructions contractCrossBankCopyIntoStore() finds the instruction defines the source register and uses its output to replace the register. There are, however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES. Current implementation hardcodes to operand 0 and has no way of knowing which output should be used. This change adds another function to directly return the register that is the source of the register and use that for folding. This fixes https://bugs.llvm.org/show_bug.cgi?id=44783 Differential Revision: https://reviews.llvm.org/D74005	2020-02-05 10:38:35 -08:00
Simon Pilgrim	4592bb7195	visitINSERT_VECTOR_ELT - pull out repeated dyn_cast. NFCI. This always gets called at least once.	2020-02-05 13:30:54 +00:00
Djordje Todorovic	de90d73e03	[DebugInfo] Avoid the call site param for mem instrs with multiple defs We currently only handle mem instructions with a single define. Avoid the call site parameter debug info when we find the case with multiple defs, rather than throwing an assert. Differential Revision: https://reviews.llvm.org/D73954	2020-02-05 10:03:14 +01:00
Thomas Lively	649aba93a2	Revert "[WebAssembly][InstrEmitter] Foundation for multivalue call lowering" Summary: This reverts commit `3ef169e586`. The purpose of this commit was to allow stack machines to perform instruction selection for instructions with variadic defs. However, MachineInstrs fundamentally cannot support variadic defs right now, so this change does not turn out to be useful. Depends on D73927. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73928	2020-02-04 20:04:59 -08:00
David Blaikie	ec50e10db4	DebugInfo: Hash DW_OP_convert in loclists when using Split DWARF Originally committed in: `1ced28cbe7` Reverted in: `f75301d16d` (reverted due to tests failing on non-linux/x86 targets, tests have since been generalized and specialized... since Split DWARF isn't supported on non-elf targets anyway and we have no way to run on "whatever elf target is available" so they fail on MacOS without an explicit target triple) This code was incorrectly emitting extra bytes into arbitrary parts of the object file when it was meant to be hashing them to compute the DWO ID. Follow-up patch(es) will refactor this API somewhat to make such bugs harder to introduce, hopefully.	2020-02-04 19:25:47 -08:00
Francis Visoiu Mistrih	7531a5039f	[Remarks] Extend the RemarkStreamer to support other emitters This extends the RemarkStreamer to allow for other emitters (e.g. frontends, SIL, etc.) to emit remarks through a common interface. See changes in llvm/docs/Remarks.rst for motivation and design choices. Differential Revision: https://reviews.llvm.org/D73676	2020-02-04 17:16:02 -08:00
Reid Kleckner	2d89e0a098	[SEH] Remove CATCHPAD SDNode and X86::EH_RESTORE MachineInstr The CATCHPAD node mostly existed to be selected into the EH_RESTORE instruction, which sets the frame back up when 32-bit Windows exceptions return to the parent function. However, creating this MachineInstr early increases the risk that other passes will come along and insert instructions that use the stack before ESP and EBP are restored. That happened in PR44697. Instead of representing these in the instruction stream early, delay it until PEI. Mark the blocks where this needs to happen as EHPads, but not funclet entry blocks. Passes after PEI have to be careful not to hoist instructions that can use stack across frame setup instructions, so this should be relatively reliable. Fixes PR44697 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D73752	2020-02-04 15:13:12 -08:00
Matt Arsenault	23b76096b7	CodeGenPrepare: Reorder check for cold and shouldOptimizeForSize shouldOptimizeForSize is showing up in a profile, spending around 10% of the pass time in one function. This should probably not be so slow, but the much cheaper attribute check should be done first anyway.	2020-02-04 11:23:13 -08:00
Matt Arsenault	de8451fe4d	GlobalISel: Fold SmallVector resizes into constructors	2020-02-04 10:28:08 -08:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Nico Weber	f75301d16d	Revert "DebugInfo: Check DW_OP_convert in loclists with Split DWARF" and follow-ups. This reverts commit `1ced28cbe7`. This reverts commit `4f281f0474`. This reverts commit `552a8fe12b`. The test fails on non-Linux.	2020-02-04 10:06:46 -05:00
Jeremy Morse	41206b61e3	[DebugInfo] Re-instate LiveDebugVariables scope trimming This patch reverts part of r362750 / D62650, which stopped LiveDebugVariables from trimming leading variable location ranges down to only covering those instructions that are in scope. I've observed some circumstances where the number of DBG_VALUEs in a function can be amplified in an un-necessary way, to cover more instructions that are out of scope, leading to very slow compile times. Trimming the range of instructions that the variables cover solves the slow compile times. The specific problem that r362750 tries to fix is addressed by the assignment to RStart that I've added. Any variable location that begins at the first instruction of a block will now be considered to begin at the start of the block. While these sound the same, the have different SlotIndexes, and the register allocator may shoehorn additional instructions in between the two. The test added in the past (wrong_debug_loc_after_regalloc.ll) still works with this modification. live-debug-variables.ll has a range trimmed to not cover the prologue of the function, while dbg-addr-dse.ll has a DBG_VALUE sink past one instruction with no DebugLoc, which is expected behaviour. Differential Revision: https://reviews.llvm.org/D73691	2020-02-04 14:51:06 +00:00
Filipe Cabecinhas	abada5036e	[NFC] Fix some spelling mistakes to test pushing to GH.	2020-02-04 11:07:31 +00:00
Simon Pilgrim	3dd688a9ee	[DAG] OptLevelChanger - fix uninitialized variable analyzer warning (PR44471) Ensure that OptLevelChanger::SavedFastISel is initialized in the constructor. This should be NFC - as the equivalent 'same opt level' early-out is used in the destructor as well, so SavedFastISel is only actually referenced in the general case. Differential Revision: https://reviews.llvm.org/D73875	2020-02-04 10:54:33 +00:00
David Green	362d00e051	[ARM][VecReduce] Force expand vector_reduce_fmin Under MVE, we do not have any lowering for fminimum, which a vector_reduce_fmin without NoNan will be expanded into. As with the other recent patches, force this to expand in the pre-isel pass. Note that Neon lowering would be OK because the scalar fminimum uses the vector VMIN instruction, but is probably better to just rely on the scalar operations, which is what is done here. Also fixes what appears to be the reversal of INF vs -INF in the vector_reduce_fmin widening code.	2020-02-04 09:36:59 +00:00
Guillaume Chatelet	b8144c0536	[NFC] Encapsulate MemOp logic Summary: This patch simply introduces functions instead of directly accessing the fields. This helps introducing additional check logic. A second patch will add simplifying functions. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73945	2020-02-04 10:36:26 +01:00
David Blaikie	1ced28cbe7	DebugInfo: Hash DW_OP_convert in loclists when using Split DWARF This code was incorrectly emitting extra bytes into arbitrary parts of the object file when it was meant to be hashing them to compute the DWO ID. Follow-up patch(es) will refactor this API somewhat to make such bugs harder to introduce, hopefully.	2020-02-03 19:16:42 -08:00
David Blaikie	031f83fb82	DebugInfo: Simplify emitDebugLocEntry by never passing a null CU	2020-02-03 18:47:14 -08:00
Matt Arsenault	cd7650c186	GlobalISel: Implement fewerElementsVector for G_SEXT_INREG Start using a new strategy with a combination of merge and unmerges. This allows scalarizing before lowering, which in cases like <2 x s128> avoids producing giant illegal shifts.	2020-02-03 11:47:33 -08:00
Quentin Colombet	f26ff8c9df	[TargetRegisterInfo] Make the heuristic to skip region split overridable by the target RegAllocGreedy uses a fairly compile time intensive splitting heuristic called region splitting. This heuristic was disabled via another heuristic when it is likely that it won't be worth the compile time. The only way to control this other heuristic was via a command line option (huge-size-for-split). This commit gives more control on this heuristic by making it overridable by the target using a target hook in TargetRegisterInfo called shouldRegionSplitForVirtReg. The default implementation of this hook keeps the heuristic as it was before this patch.	2020-02-03 11:30:35 -08:00
Simon Pilgrim	61621f826a	[TargetLowering] SimplifyDemandedBits - add basic KnownBits ZEXTLoad handling We have to be careful in SimplifyDemandedBits with loads in case we attempt to combine back to a constant (which then gets turned into a constant pool load again), but we can at least set the upper KnownBits for a ZEXTLoad to zero.	2020-02-03 16:50:04 +00:00
Guillaume Chatelet	333f2ad8b8	[Alignment][NFC] Use Align for getMemcpy/Memmove/Memset Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73885	2020-02-03 17:13:19 +01:00
Guillaume Chatelet	fc19465965	[Alignment][NFC] Use Align for code creating MemOp Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73874	2020-02-03 14:10:30 +01:00
Guillaume Chatelet	75d9994a51	Fix broken invariant Summary: A Copy with a source that is zeros is the same as a Set of zeros. This fixes the invariant that SrcAlign should always be non-null. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73791	2020-02-03 11:01:05 +01:00
Fangrui Song	44cdae68c3	[CodeGenPrepare] Delete dead !DL check Follow-up for D73754 DL is assigned in CodeGenPrepare::runOnFunction and is guaranteed to be non-null.	2020-02-02 09:49:06 -08:00
Fangrui Song	5a56a25b0b	[CodeGenPrepare] Make TargetPassConfig required The code paths in the absence of TargetMachine, TargetLowering or TargetRegisterInfo are poorly tested. As rL285987 said, requiring TargetPassConfig allows us to delete many (untested) checks littered everywhere. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73754	2020-02-02 09:28:45 -08:00
Fangrui Song	5932f7b8f2	[PatchableFunction] Use an empty DebugLoc The current FirstMI.getDebugLoc() is actually null in almost all cases. If it isn't, the generated .loc will be considered initial. The .loc will have the prologue_end flag and terminate the prologue prematurely. Also use an overload of BuildMI that will not prepend PATCHABLE_FUNCTION_ENTRY to a MachineInstr bundle.	2020-02-01 14:12:06 -08:00
Craig Topper	943b5561d6	[LegalizeTypes][X86] Add a new strategy for type legalizing f16 type that softens it to i16, but promotes to f32 around arithmetic ops. This is based on this llvm-dev thread http://lists.llvm.org/pipermail/llvm-dev/2019-December/137521.html The current strategy for f16 is to promote type to float every except where the specific width is required like loads, stores, and bitcasts. This results in rounding occurring in odd places instead of immediately after arithmetic operations. This interacts in weird ways with the __fp16 type in clang which is a storage only type where arithmetic is always promoted to float. InstCombine can remove some fpext/fptruncs around such arithmetic and turn it into arithmetic on half. This wouldn't be so bad if SelectionDAG was able to put those fpext/fpround back in when it promotes. It is also not obvious how to handle to make the existing strategy work with STRICT fp. We need to use STRICT versions of the conversions which require chain operands. But if the conversions are created for a bitcast, there is no place to get an appropriate chain from. This patch implements a different strategy where conversions are emitted directly around arithmetic operations. And otherwise its passed around as an i16 including in arguments and return values. This can result in more conversions between arithmetic operations, but is closer to matching the IR the frontend generates for __fp16. And it will allow us to use the chain from constrained arithmetic nodes to link the STRICT_FP_TO_FP16/STRICT_FP16_TO_FP that will need to be added. I've set it up so that each target can opt into the new behavior. Converting all the targets myself was more than I was able to handle. Differential Revision: https://reviews.llvm.org/D73749	2020-02-01 11:21:04 -08:00
Matt Arsenault	bc101ffd77	GlobalISel: Support widening unmerge results with pointer source	2020-02-01 10:47:03 -05:00
David Blaikie	338beff4dc	DwarfDebug.cpp: Fix some indentation	2020-01-31 16:01:57 -08:00
David Blaikie	b33e5f3c3e	DebugInfo: Split DWARF: Hash non-member function child DIEs Significant missing hashing - as per the comment this was only meant to skip member functions (unspecified, but I think it's legible as member function declarations, not definitions) but was skipping all named subprograms (so only hashed child DIEs for member function definitions - because they didn't have a direct name, but only a name given indirectly in the DW_AT_specification-referenced DIE)	2020-01-31 15:32:03 -08:00
Matt Arsenault	792d9b5719	DAG: Check if a value is divergent before requiresUniformRegister This avoids a potentially expensive scan if we already know it doesn't matter.	2020-01-31 15:27:18 -08:00
Jay Foad	f465b1aff4	[GlobalISel] Tweak lowering of G_SMULO/G_UMULO Summary: Applying this cleanup: - MIRBuilder.buildInstr(TargetOpcode::G_ASHR) - .addDef(Shifted) - .addUse(Res) - .addUse(ShiftAmt); + MIRBuilder.buildAShr(Shifted, Res, ShiftAmt); caused an assertion failure here: llc: /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:404: llvm::MachineInstr *llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() \|\| std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition"' failed. #4 0x00000000050a6d96 in llvm::MachineRegisterInfo::getVRegDef (this=0x74606a0, Reg=2147483650) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:403 #5 0x00000000066148f6 in llvm::getConstantVRegValWithLookThrough (VReg=2147483650, MRI=..., LookThroughInstrs=false, HandleFConstant=true) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:244 #6 0x00000000066147da in llvm::getConstantVRegVal (VReg=2147483650, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:210 #7 0x0000000006615367 in llvm::ConstantFoldBinOp (Opcode=101, Op1=2147483650, Op2=2147483656, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:341 #8 0x000000000657eee0 in llvm::CSEMIRBuilder::buildInstr (this=0x7465010, Opc=101, DstOps=..., SrcOps=..., Flag=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp:160 #9 0x0000000003645958 in llvm::MachineIRBuilder::buildAShr (this=0x7465010, Dst=..., Src0=..., Src1=..., Flags=...) at /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1298 #10 0x00000000065c35b1 in llvm::LegalizerHelper::lower (this=0x7fffffffb5f8, MI=..., TypeIdx=0, Ty=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2020 because at this point there are two instructions defining Res: the original G_SMULO/G_UMULO and the new G_MUL that we built. The fix is to modify the original mul in place, so that there is only ever one definition of Res. Reviewers: arsenm, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72842	2020-01-31 19:21:01 +00:00
Simon Pilgrim	8fbc7fd567	[DAG] SimplifyMultipleUseDemandedBits - peek through unused ISD::INSERT_SUBVECTOR subvectors If we don't demand any elements of the inserted subvector then just skip it.	2020-01-31 18:57:22 +00:00
Simon Pilgrim	5702dadf6f	[DAG] Enable ISD::INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::INSERT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-31 18:02:34 +00:00
Hiroshi Yamauchi	ac8da31a0f	[PGO][PGSO] Handle MBFIWrapper Some code gen passes use MBFIWrapper to keep track of the frequency of new blocks. This was not taken into account and could lead to incorrect frequencies as MBFI silently returns zero frequency for unknown/new blocks. Add a variant for MBFIWrapper in the PGSO query interface. Depends on D73494.	2020-01-31 09:36:55 -08:00
Jay Foad	2a1b5af299	[GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister Summary: As a side effect some redundant copies of constant values are removed by CSEMIRBuilder. Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73789	2020-01-31 17:07:16 +00:00
Guillaume Chatelet	3c89b75f23	[NFC] Introduce a type to model memory operation Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785	2020-01-31 17:29:01 +01:00
Quentin Colombet	cfebd77742	[GISel][KnownBits] Fix a bug where we could run out of stack space One of the exit criteria of computeKnownBits is whether we reach the max recursive call depth. Before this patch we would check that the depth is exactly equal to max depth to exit. Depth may get bigger than max depth if it gets passed to a different GISelKnownBits object. This may happen when say a generic part uses a GISelKnownBits object with some max depth, but then we hit TL.computeKnownBitsForTargetInstr which creates a new GISelKnownBits object with a different and smaller depth. In that situation, when we hit the max depth check for the first time in the target specific GISelKnownBits object, depth may already be bigger than the current max depth. Hence we would continue to compute the known bits, until we ran through the full depth of the chain of computation or ran out of stack space. For instance, let say we have GISelKnownBits Info(/MaxDepth/ = 10); Info.getKnownBits(Foo) // 9 recursive calls to computeKnownBitsImpl. // Then we hit a target specific instruction. // The target specific GISelKnownBits does this: GISelKnownBits TargetSpecificInfo(/MaxDepth/ = 6) TargetSpecificInfo.computeKnownBitsImpl() // <-- next max depth checks would // always return false. This commit does not have any test case, none of the in-tree targets use computeKnownBitsForTargetInstr.	2020-01-30 19:30:39 -08:00
Leonard Chan	2d3174c4df	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. This is an attempt to reland with a few fixes for buildbot since I haven't merged from master in a bit. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 17:09:42 -08:00
Amara Emerson	84bd851108	[GlobalISel][IRTranslator] When translating vector geps, splat the base pointer if required. We can have geps that have a scalar base pointer, and a vector index value, which means that the base pointer must be splatted into a vector of pointers. This fixes crashes on arm64 GlobalISel with optimizations enabled.	2020-01-30 16:27:27 -08:00
Leonard Chan	3b23453b6c	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `fff6a1b0f1`. This was breaking a bunch of buildbots.	2020-01-30 16:18:41 -08:00
Leonard Chan	fff6a1b0f1	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 15:58:37 -08:00
Matt Arsenault	eb7f74e300	CodeGen: Use Register	2020-01-30 15:01:56 -08:00
Sean Fertile	8b737688c2	[AIX] Minor cleanup in AsmPrinter. [NFC] - Extends the comments related to function descriptors, noting how they are only used on AIX. - Changes the condition used to gate the creation of the current function symbol in AsmPrinter::SetupMachineFunction to reflect being AIX specific. The creation of the symbol is different because of AIXs linkage conventions, not because AIX uses function descriptors. Differential Revision: https://reviews.llvm.org/D73115	2020-01-30 14:15:02 -05:00
Fangrui Song	06b8e32d4f	[AArch64] -fpatchable-function-entry=N,0: place patch label after BTI Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after `9a24488cb6`, we place the NOP sled after the initial BTI. ``` .Lfunc_begin0: bti c nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lfunc_begin0 ``` This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label: ``` .Lfunc_begin0: bti c .Lpatch0: nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lpatch0 ``` This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680	2020-01-30 11:11:52 -08:00
Matt Arsenault	ea956685a1	GlobalISel: Implement s32->s64 G_FPTOSI lowering Port directly from DAG version. The lowering for G_FPTOUI used to fail on AMDGPU because it uses G_FPTOSI.	2020-01-30 08:47:07 -05:00
Dominik Montada	dc141af755	[GlobalISel] (fix) Use pointer type size for offset constant when lowering stores Commit `9965b12fd1` was supposed to change the offset constant when lowering load/stores, but only introduced this change for loads. This patch adds the same fix for stores.	2020-01-30 08:32:35 -05:00
Simon Pilgrim	57b0d33224	[DAGCombiner] ISD::AND/OR/XOR - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:53 +00:00
Simon Pilgrim	a967aa2706	[DAGCombiner] ISD::SDIV/UDIV/SREM/UREM - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:52 +00:00
Matt Arsenault	c5fffa4da3	GlobalISel: Add observer argument to legalizeIntrinsic This is passed to legalizeCustom, but not intrinsic. Also remove the MRI argument, since you can get that from the MachineIRBuilder. I'm not sure why MachineIRBuilder has a private observer member, and this is passed separately.	2020-01-29 18:33:45 -05:00
Amara Emerson	c12f046eb9	[GlobalISel] Add new combine to convert scalar G_MUL to G_SHL. For pow2 constants we should use G_SHL for pattern matching (and perf) purposes later. Vector support not yet implemented. Differential Revision: https://reviews.llvm.org/D73659	2020-01-29 13:39:00 -08:00
Amara Emerson	0da937bb5c	[GlobalISel][IRTranslator] Follow convention and put constant offset of getelementptr arithmetic on RHS. We were needlessly putting known constant values on the LHS of a G_MUL, which is suboptimal. Differential Revision: https://reviews.llvm.org/D73650	2020-01-29 11:37:19 -08:00
Fangrui Song	8903e61b66	[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects For `MC_GlobalAddress` operands referencing certain GlobalObjects, we can lower them to STB_LOCAL aliases to avoid costs brought by assembler/linker's conservative decisions about symbol interposition: * An assembler conservatively assumes a global default visibility symbol interposable (ELF semantics). So relocations in object files are needed even if the code generator assumed the definition exact and non-interposable. * The relocations can cause the creation of PLT entries on some targets for -shared links. A linker conservatively assumes a global default visibility symbol interposable (if not otherwise constrained by -Bsymbolic/--dynamic-list/VER_NDX_LOCAL/etc). "certain" refers to GlobalObjects in the intersection of `hasExactDefinition() and !isInterposable()`: `external`, `appending`, `internal`, `private`. Local linkages (`internal` and `private`) cannot be interposed. `appending` is for very few objects LLVM interpret specially. So the set just includes `external`. This patch emits STB_LOCAL aliases (.Lfoo$local) for such GlobalObjects, so that targets can lower MC_GlobalAddress operands to STB_LOCAL aliases if applicable. We may extend the scope and include GlobalAlias in the future. LLVM's existing -fno-semantic-interposition behaviors give us license to do such optimizations: * Various optimizations (ipconstprop, inliner, sccp, sroa, etc) treat normal ExternalLinkage GlobalObjects as non-interposable. * Before D72197, MC resolved a PC-relative VK_None fixup to a non-local symbol at assembly time (no outstanding relocation), if the target is defined in the same section. Put it simply, even if IR optimizations failed to optimize and allowed interposition for the function call in `void foo() {} void bar() { foo(); }`, the assembler would disallow it. This patch sets up AsmPrinter infrastructure to make -fno-semantic-interposition more so. With and without the patch, the object file output should be identical: `.Lfoo$local` does not take a symbol table entry. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D73228	2020-01-29 10:58:43 -08:00
Simon Pilgrim	f7245ef897	[DAGCombiner] ISD::SHL/SRA/SRL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 18:49:42 +00:00
Adrian Prantl	18dbe1b279	Run clang-format on DwarfExpression (NFC)	2020-01-29 10:23:12 -08:00
Adrian Prantl	816ee8a423	DwarfExpression: Factor out getOrCreateBaseType() (NFC)	2020-01-29 10:23:12 -08:00
Simon Pilgrim	25b8e96388	[DAGCombiner] ISD::MUL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 17:26:22 +00:00
Simon Pilgrim	4b04e11735	[DAGCombiner] Sub/SUBSAT - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 16:57:13 +00:00
Simon Pilgrim	48bd6a0986	[DAGCombiner] visitIMINMAX - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us instead of the ConstantSDNode variant where we have to do it ourselves.	2020-01-29 16:57:13 +00:00
Matt Arsenault	b63629a58d	GlobalISel: Fix mask computation in lowerInsert This is supposed to be the high bit index, not the width. Use the wrapping form of getBitsSet and avoid the bitflip.	2020-01-29 08:25:36 -08:00
Jay Foad	0d7bd34312	[MachineScheduler] Ignore artificial edges when forming store chains Summary: BaseMemOpClusterMutation::apply forms store chains by looking for control (i.e. non-data) dependencies from one mem op to another. In the test case, clusterNeighboringMemOps successfully clusters the loads, and then adds artificial edges to the loads' successors as described in the comment: // Copy successor edges from SUa to SUb. Interleaving computation // dependent on SUa can prevent load combining due to register reuse. The effect of this is that data dependencies from one load to a store are copied as artificial dependencies from a different load to the same store. Then when BaseMemOpClusterMutation::apply looks at the stores, it finds that some of them have a control dependency on a previous load, which breaks the chains and means that the stores are not all considered part of the same chain and won't all be clustered. The fix is to only consider non-artificial control dependencies when forming chains. Subscribers: MatzeB, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71717	2020-01-29 16:23:01 +00:00
Matt Arsenault	f717483acd	GlobalISel: Assert on invalid bitcast in MIRBuilder The other casts validate, so this should too.	2020-01-29 07:49:39 -08:00
Matt Arsenault	c5c1bb3374	GlobalISel: Lower G_WRITE_REGISTER	2020-01-29 06:48:24 -08:00
David Stenberg	6a2413c435	[ARM64] Debug info for structure argument missing DW_AT_location Summary: Prevent eliminating dbg_val due to COPY. Fixes this https://bugs.llvm.org/show_bug.cgi?id=40709 Patch by: Kamlesh Kumar (kamleshbhalui) Reviewers: aprantl, dblaikie, vsk, dsanders Reviewed By: dsanders Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73159	2020-01-29 10:56:23 +01:00
Sam Parker	ac30ea2f87	[RDA][ARM] Move functionality into RDA Add several new helpers to RDA: - hasLocalDefBefore - isRegDefinedAfter - isSafeToDefRegAt And move two bits of logic from ARMLowOverheadLoops into RDA: - isSafeToMove - isSafeToRemove Both of these have some wrappers too to make them more convienent to use. Differential Revision: https://reviews.llvm.org/D73460	2020-01-29 03:27:47 -05:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Michael Spang	a2fb2c0ddc	[GlobalMerge] Preserve symbol visibility when merging globals Symbols created for merged external global variables have default visibility. This can break programs when compiling with -Oz -fvisibility=hidden as symbols that should be hidden will be exported at link time. Differential Revision: https://reviews.llvm.org/D73235	2020-01-28 13:26:18 -08:00
Hiroshi Yamauchi	2c03c899d5	[MBFI] Move BranchFolding::MBFIWrapper to its own files. NFC. Summary: To avoid header file circular dependency issues in passing updated MBFI (in MBFIWrapper) to the interface of profile guided size optimizations. A prep step for (and split off of) D73381. Reviewers: davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73494	2020-01-28 10:58:46 -08:00
Sam Parker	7ad879caa0	[NFC][RDA] typedef SmallPtrSetImpl<MachineInstr*>	2020-01-28 13:15:44 +00:00
Wang, Pengfei	3239b5034e	[FPEnv] Add pragma FP_CONTRACT support under strict FP. Summary: Support pragma FP_CONTRACT under strict FP. Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3 Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72820	2020-01-28 20:43:43 +08:00
Guillaume Chatelet	879c825cb8	[instrinsics] Add @llvm.memcpy.inline instrinsics Summary: This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++. A follow up CL will add the intrinsics in Clang. Reviewers: courbet, theraven, t.p.northover, jdoerfert, tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71710	2020-01-28 09:42:01 +01:00
Fangrui Song	c7c5da6df3	Reland "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()"" Reland `7a8b0b1595`, with a fix that checks `!E.value().empty()` to avoid inserting a zero to SlotRemap. Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D73510	2020-01-27 15:58:49 -08:00
Jay Foad	cbbbd5b5f6	[GlobalISel] Make use of KnownBits::computeForAddSub Summary: This is mostly NFC. computeForAddSub may give more precise results in some cases, but that doesn't seem to affect any existing GlobalISel tests. Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73431	2020-01-27 22:22:56 +00:00
Simon Pilgrim	e7e043724e	[DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow. Differential Revision: This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-27 21:17:47 +00:00
Adrian Prantl	a095d149c2	Fix an assertion failure in DwarfExpression's subregister composition This patch fixes an assertion failure in DwarfExpression that is triggered when a complex fragment has exactly the size of a subregister of the register the DBG_VALUE points to and there is no DWARF encoding for the super-register. I took the opportunity to replace/document some magic values with static constructor functions to make this code less confusing to read. rdar://problem/58489125 Differential Revision: https://reviews.llvm.org/D72938	2020-01-27 12:44:37 -08:00
Vedant Kumar	e08f205f5c	Reland (again): [DWARF] Allow cross-CU references of subprogram definitions This is a revert-of-revert (i.e. this reverts commit `802bec89`, which itself reverted `fa4701e1` and `79daafc9`) with a fix folded in. The problem was that call site tags weren't emitted properly when LTO was enabled along with split-dwarf. This required a minor fix. I've added a reduced test case in test/DebugInfo/X86/fission-call-site.ll. Original commit message: This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. Update #1: Reland with a fix to create a declaration DIE when the declaration is missing from the CU's retainedTypes list. The declaration is left out of the retainedTypes list in two cases: 1) Re-compiling pre-r266445 bitcode (in which declarations weren't added to the retainedTypes list), and 2) Doing LTO function importing (which doesn't update the retainedTypes list). It's possible to handle (1) and (2) by modifying the retainedTypes list (in AutoUpgrade, or in the LTO importing logic resp.), but I don't see an advantage to doing it this way, as it would cause more DWARF to be emitted compared to creating the declaration DIEs lazily. Update #2: Fold in a fix for call site tag emission in the split-dwarf + LTO case. Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a ReleaseLTO-g build of the test suite. rdar://46577651, rdar://57855316, rdar://57840415, rdar://58888440 Differential Revision: https://reviews.llvm.org/D70350	2020-01-27 10:52:34 -08:00
Nico Weber	68051c1224	Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" This reverts commit `7a8b0b1595`. It seems to break exception handling on 32-bit Windows, see https://crbug.com/1045650	2020-01-27 11:22:33 -05:00
Dominik Montada	9965b12fd1	Use pointer type size for offset constant when lowering load/stores	2020-01-27 06:55:32 -08:00
Matt Arsenault	2a160ba5b0	GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results Only use shifts if the requested type exactly matches the source type, and create sub-unmerges otherwise.	2020-01-27 06:18:26 -08:00
Matt Arsenault	06d9230fef	GlobalISel: Translate vector GEPs	2020-01-27 05:35:05 -08:00
Igor Kudrin	8f3d47c54a	[DWARF] Do not pass Version to DWARFExpression. NFCI. The Version was used only to determine the size of an operand of DW_OP_call_ref. The size was 4 for all versions apart from 2, but the DW_OP_call_ref operation was introduced only in DWARF3. Thus, the code may be simplified and using of Version may be eliminated. Differential Revision: https://reviews.llvm.org/D73264	2020-01-27 19:08:46 +07:00
David Stenberg	13d4ef9ac0	Improvements to call site register worklist Summary: This fixes PR44118. For cases where we have a chain like this: R8 = R1 (entry value) R0 = R8 call @foo R0 the code that emits call site entries using entry values would not follow that chain, instead emitting a call site entry with R8 as location rather than R0. Such a case was discovered when originally adding dbgcall-site-orr-moves.mir. This patch fixes that issue. This is done by changing the ForwardedRegWorklist set to a map in which the worklist registers always map to the parameter registers that they describe. Another thing this patch fixes is that worklist registers now can describe more than one parameter register at a time. Such a case occurred in dbgcall-site-interpretation.mir, resulting in a call site entry not being emitted for one of the parameters. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73168	2020-01-27 12:41:42 +01:00
David Stenberg	b46baa82fc	Don't separate imp/expl def handling for call site params Summary: Since D70431 the describeLoadedValue() hook takes a parameter register, meaning that it can now be asked to describe any register. This means that we can drop the difference between explicit and implicit defines that we previously had in collectCallSiteParameters(). I have not found any case for any upstream targets where a parameter register is only implicitly defined, and does not overlap with any explicit defines. I don't know if such a case would even make sense. So as far as I have tested, this patch should be a non-functional change. However, this reduces the complexity of the code a bit, and it will simplify the implementation of an upcoming patch which solves PR44118. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73167	2020-01-27 11:31:09 +01:00
Petar Avramovic	cbf03aee6d	[MIPS GlobalISel] Select population count (popcount) G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates these intrinsics from __builtin_popcount and __builtin_popcountll. Add lower and narrow scalar for G_CTPOP. Lower G_CTPOP for MIPS32. Differential Revision: https://reviews.llvm.org/D73216	2020-01-27 09:59:50 +01:00
Petar Avramovic	8bc7ba5b9e	[MIPS GlobalISel] Select count trailing zeros llvm.cttz.<type> intrinsic has additional i1 argument is_zero_undef, it tells whether zero as the first argument produces a defined result. G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false) intrinsics, clang generates these intrinsics from __builtin_ctz and __builtin_ctzll. G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true). Clang generates such intrinsics as parts of expansion of builtin_ffs and builtin_ffsll. It is also traditionally part of and many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ. Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32. Differential Revision: https://reviews.llvm.org/D73215	2020-01-27 09:51:06 +01:00
Petar Avramovic	2b66d32f3f	[MIPS GlobalISel] Select count leading zeros llvm.ctlz.<type> intrinsic has additional i1 argument is_zero_undef, it tells whether zero as the first argument produces a defined result. MIPS clz instruction returns 32 for zero input. G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false) intrinsics, clang generates these intrinsics from __builtin_clz and __builtin_clzll. G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as second argument. It is also traditionally part of and many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF). Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32. Differential Revision: https://reviews.llvm.org/D73214	2020-01-27 09:43:38 +01:00
Fangrui Song	941f20c3bd	[MachineVerifier] Simplify and delete LLVM_VERIFY_MACHINEINSTRS from a comment. NFC The environment variable has been unused since r228079.	2020-01-27 00:31:23 -08:00
Wang, Pengfei	17b8f96d65	[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI. Some functions like fmuladd don't really have a node, we should divide the declaration form those have node to avoid introducing fake nodes. Differential Revision: https://reviews.llvm.org/D72871	2020-01-27 10:38:05 +08:00
Simon Pilgrim	4a5f9d9faf	[TargetLowering] Respect recursive depth in SimplifyDemandedBits call to ComputeNumSignBits	2020-01-26 10:01:56 +00:00
Simon Pilgrim	3daa71ee00	[SelectionDAG] ComputeNumSignBits - add DemandedElts support for MIN/MAX ops	2020-01-25 20:21:14 +00:00
Simon Pilgrim	3f8916b2e8	[SelectionDAG] ComputeNumSignBits - add support for rotate non-uniform vector amounts	2020-01-25 19:15:05 +00:00
Simon Pilgrim	e3c26a9d1b	[SelectionDAG] ComputeNumSignBits - add support for rotate uniform vector amounts	2020-01-25 18:55:47 +00:00
Simon Pilgrim	c8de7c8f50	[TargetLowering] SimplifyDemandedBits - Remove ashr if all our demandedbits already match the sign bit Differential Revision: https://reviews.llvm.org/D73412	2020-01-25 17:36:46 +00:00
Vedant Kumar	802bec8961	Revert "Reland: [DWARF] Allow cross-CU references of subprogram definitions" ... as well as: Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info" This reverts commit `fa4701e197`. This reverts commit `79daafc903`. There have been reports of this assert getting hit: CalleeDIE && "Could not find DIE for call site entry origin	2020-01-24 18:07:54 -08:00
Quentin Colombet	5d87b5d202	[GISelKnownBits] Add support for PHIs Teach the GISelKnowBits analysis how to deal with PHI operations. PHIs are essentially COPYs happening on edges, so we can just reuse the code for COPY. This is NFC COPY-wise has we leave Depth untouched when calling computeKnownBitsImpl for COPYs, like it was before this patch. Increasing Depth is however required for PHIs as they may loop back to themselves and we would end up in an infinite loop if we were not increasing Depth. Differential Revision: https://reviews.llvm.org/D73317	2020-01-24 16:43:52 -08:00
@justice_adams (Justice Adams)	daee63f974	[SelectionDag] Updated FoldConstantArithmetic method signature in preparation for merge with FoldConstantVectorArithmetic Updated FoldConstantArithmetic method signature to match that of FoldConstantVectorArithmetic in preparation for merging the two functions together https://bugs.llvm.org/show_bug.cgi?id=36544 This is the first step in combining the various FoldConstantVectorArithmetic and FoldConstantVectorArithmetic functions into one FoldConstantArithmetic function. Differential Revision: https://reviews.llvm.org/D72870	2020-01-24 18:00:58 -05:00
Craig Topper	d3bf06bc81	[DAGCombiner] Add combine for (not (strict_fsetcc)) to create a strict_fsetcc with the opposite condition. Unlike the existing code that I modified here, I only handle the case where the strict_fsetcc has a single use. Not sure exactly how to handle multiples uses. Testing this on X86 is hard because we already have a other combines that get rid of lowered version of the integer setcc that this xor will eventually become. So this combine really just saves a bunch of extra nodes being created. Not sure about other targets. Differential Revision: https://reviews.llvm.org/D71816	2020-01-24 14:15:36 -08:00
Stanislav Mekhanoshin	be8e38cbd9	Correct NumLoads in clustering Scheduler sends NumLoads argument into shouldClusterMemOps() one less the actual cluster length. So for 2 instructions it will pass just 1. Correct this number. This is NFC for in tree targets. Differential Revision: https://reviews.llvm.org/D73292	2020-01-24 12:45:28 -08:00
Stanislav Mekhanoshin	7a94d4f4ee	Allow combining of extract_subvector to extract element Differential Revision: https://reviews.llvm.org/D73132	2020-01-24 10:50:26 -08:00
Fangrui Song	50a3ff30e1	[PatchableFunction] Allow empty entry MachineBasicBlock Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73301	2020-01-24 09:42:48 -08:00
Tom Weaver	f5147765ba	[DebugInfo][LiveDebugValues] Teach Live Debug Values About Meta Instructions Previously LiveDebugValues pass would consider meta instructions that 'fiddle' with liveness of registers as register definitions when transfering register defs. This would mean that, for example, a KILL instruction would cause LiveDebugValues to terminate the range of an earlier DBG_VALUE instruction resulting in the none propogation of said DBG_VALUE instructions into later blocks. This patch adds the check and a helpful comment, fixes a test that previously tested for the broken behaviour by coincidence and adds a test specifically for this. reviewers: vsk, dstenb, djtodoro Differential Revision: https://reviews.llvm.org/D73210	2020-01-24 16:29:05 +00:00
Guillaume Chatelet	805c157e8a	[Alignment][NFC] Deprecate Align::None() Summary: This is a follow up on https://reviews.llvm.org/D71473#inline-647262. There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case. Reviewers: xbolva00, courbet, bollu Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73099	2020-01-24 12:53:58 +01:00
Simon Pilgrim	0b45c2264a	[SelectionDAG] rot(x, y) --> x iff ComputeNumSignBits(x) == BitWidth(x) Rotating an 0/-1 value by any amount will always result in the same 0/-1 value	2020-01-24 10:35:57 +00:00
Fangrui Song	22467e2595	Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0 Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label. -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like: ``` .type foo,@function .Ltmp0: # @foo nop foo: .Lfunc_begin0: # optional `bti c` (AArch64 Branch Target Identification) or # `endbr64` (Intel Indirect Branch Tracking) nop .section __patchable_function_entries,"awo",@progbits,get,unique,0 .p2align 3 .quad .Ltmp0 ``` -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html): ``` (a) (b) func: func: .Ltmp0: bti c bti c .Ltmp0: nop nop ``` (a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp . Differential Revision: https://reviews.llvm.org/D73070	2020-01-23 17:02:27 -08:00

1 2 3 4 5 ...

28070 Commits