llvm-project

Commit Graph

Author	SHA1	Message	Date
Kuter Dinel	a78671ef54	[FIX][Attributor] Fix broken build due to missing virtual deconstructors. The lack some virtual deconstructors where causing some builds bots to fail. This patch fixes that. Problematic commit: https://reviews.llvm.org/rGeaf1b6810ce0f40008b2b1d902750eafa3e198d3 Build bot: https://lab.llvm.org/buildbot/#/builders/18/builds/1741	2021-06-18 07:32:51 +03:00
Kuter Dinel	eaf1b6810c	[Attributor] Derive AACallEdges attribute This attribute computes the optimistic live call edges using the attributor liveness information. This attribute will be used for deriving a inter-procedural function reachability attribute. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D104059	2021-06-18 03:29:22 +03:00
Jon Roelofs	a2ab765029	[GISel] Eliminate redundant bitmasking This was a GISel vs SDAG regression that showed up at -Os on arm64 in: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.test https://llvm.godbolt.org/z/aecjodsjG Differential revision: https://reviews.llvm.org/D103334	2021-06-17 12:53:00 -07:00
Jorge Gorbe Moya	9ac7388e3d	Revert "[NFC] Remove checking pointee type for byval/preallocated type" This reverts commit `738abfdbea`.	2021-06-17 12:29:23 -07:00
Saleem Abdulrasool	bbea64250f	RISCV: adjust handling of relocation emission for RISCV This re-architects the RISCV relocation handling to bring the implementation closer in line with the implementation in binutils. We would previously aggressively resolve the relocation. With this restructuring, we always will emit a paired relocation for any symbolic difference of the type of S±T[±C] where S and T are labels and C is a constant. GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE` which indicates that a fixup may be expanded into multiple relocations. This is used by the RISCV backend to always emit a paired relocation - either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] + SUB[WIDTH] for a debug info relocation. Irrespective of whether linker relaxation support is enabled, symbolic difference is always emitted as a paired relocation. This change also sinks the target specific behaviour down into the target specific area rather than exposing it to the shared relocation handling. In the process, we also sink the "special" handling for debug information down into the RISCV target. Although this improves the path for the other targets, this is not necessarily entirely ideal either. The changes in the debug info emission could be done through another type of hook as this functionality would be required by any other target which wishes to do linker relaxation. However, as there are no other targets in LLVM which currently do this, this is a reasonable thing to do until such time as the code needs to be shared. Improve the handling of the relocation (and add a reduced test case from the Linux kernel) to ensure that we handle complex expressions for symbolic difference. This ensures that we correct relocate symbols with the adddends normalized and associated with the addition portion of the paired relocation. This change also addresses some review comments from Alex Bradbury about the relocations meant for use in the DWARF CFA being named incorrectly (using ADD6 instead of SET6) in the original change which introduced the relocation type. This resolves the issues with the symbolic difference emission sufficiently to enable building the Linux kernel with clang+IAS+lld (without linker relaxation). Resolves PR50153, PR50156! Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143 Reviewed By: nickdesaulniers, maskray Differential Revision: https://reviews.llvm.org/D103539	2021-06-17 08:20:02 -07:00
Stephen Tozer	dee2c76b4c	Reapply "[DebugInfo] Prevent non-determinism when updating DIArgList users of a value" Reapply the commit which previously caused build failures due to the mismatched template arguments between the return type and the returned SmallVector. This reverts commit `e8991caea8`.	2021-06-17 16:16:55 +01:00
Guillaume Chatelet	26f1f6d0de	[llvm] fix typo in comment	2021-06-17 14:30:52 +00:00
David Green	fda8b4714e	[InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass The Interleave Access pass will convert shuffle(binop(load, load)) to binop(shuffle(load), shuffle(load)), in order to create more interleaving load patterns (VLD2/3/4) that might have been messed up by instcombine. As shown in D104247 we were missing copying IR flags to the new instruction though, which should just be kept the same as the original instruction. Differential Revision: https://reviews.llvm.org/D104255	2021-06-17 09:53:33 +01:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Lang Hames	838490de7e	[ORC] Switch from uint8_t to char buffers for TargetProcessControl::runWrapper. This matches WrapperFunctionResult's char buffer, cutting down on the number of pointer casts needed.	2021-06-17 13:27:09 +10:00
Adrian Prantl	f9aba9a5af	Move the definition of LLVM_SUPPORT_XCODE_SIGNPOSTS into llvm-config.h since it is now used by a public header file (Signposts.h). This fixes the standalone LLDB build.	2021-06-16 14:40:37 -07:00
Min-Yih Hsu	c29555342c	[MCA] Anchoring the vtable of CustomBehaviour Put the dtor of mca::CustomBehaviour into the cpp file to avoid undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared library. Differential Revision: https://reviews.llvm.org/D104401	2021-06-16 12:43:58 -07:00
Hongtao Yu	cef9b96b01	[CSSPGO] Report zero-count probe in profile instead of dangling probes. Previously dangling samples were represented by INT64_MAX in sample profile while probes never executed were not reported. This was based on an observation that dangling probes were only at a smaller portion than zero-count probes. However, with compiler optimizations, dangling probes end up becoming at large portion of all probes in general and reporting them does not make sense from profile size point of view. This change flips sample reporting by reporting zero-count probes instead. This enabled dangling probe to be represented by none (missing entry in profile). This has a couple benefits: 1. Reducing sample profile size in optimize mode, even when the number of non-executed probes outperform the number of dangling probes, since INT64_MAX takes more space over 0 to encode. 2. Binary size savings. No need to encode dangling probe anymore, since missing probes are treated as dangling in the profile reader. 3. Reducing compiler work to track dangling probes. However, for probes that are real dead and removed, we still need the compiler to identify them so that they can be reported as zero-count, instead of mistreated as dangling probes. 4. Improving counts quality by respecting the counts already collected on the non-dangling copy of a probe. A probe, when duplicated, gets two copies at runtime. If one of them is dangling while the other is not, merging the two probes at profile generation time will cause the real samples collected on the non-dangling one to be discarded. Not reporting the dangling counterpart will keep the real samples. 5. Better readability. 6. Be consistent with non-CS dwarf line number based profile. Zero counts are trusted by the compiler counts inferencer while missing counts will be inferred by the compiler. Note that the current patch does include any work for #3. There will be follow-up changes. For #1, I've seen for a large Facebook service, the text profile is reduced by 7%. For extbinary profile, the size of LBRProfileSection is reduced by 35%. For #4, I have seen general counts quality for SPEC2017 is improved by 10%. Reviewed By: wenlei, wlei, wmi Differential Revision: https://reviews.llvm.org/D104129	2021-06-16 11:45:29 -07:00
Sushma Unnibhavi	2193347e72	[M68k][GloballSel] Adding initial GlobalISel infrastructure Wiring up GlobalISel for the M68k backend Differential Revision: https://reviews.llvm.org/D101819	2021-06-16 10:48:38 -06:00
Patrick Holland	ef16c8eaa5	Reapply "[MCA] Adding the CustomBehaviour class to llvm-mca". The original change was pushed in main as commit `f7a23ecece`. It was then reverted by commit `a04f01bab2` because it caused linker failures on buildbots that don't build the AMDGPU target. -- Some instructions are not defined well enough within the target’s scheduling model for llvm-mca to be able to properly simulate its behaviour. The ideal solution to this situation is to modify the scheduling model, but that’s not always a viable strategy. Maybe other parts of the backend depend on that instruction being modelled the way that it is. Or maybe the instruction is quite complex and it’s difficult to fully capture its behaviour with tablegen. The CustomBehaviour class (which I will refer to as CB frequently) is designed to provide intuitive scaffolding for developers to implement the correct modelling for these instructions. More details are available in the original commit log message (`f7a23ecece`). Differential Revision: https://reviews.llvm.org/D104149	2021-06-16 16:54:48 +01:00
David Spickett	e4ecd83fe9	[llvm][AArch64] Handle arrays of struct properly (from IR) This only applies to FastIsel. GlobalIsel seems to sidestep the issue. This fixes https://bugs.llvm.org/show_bug.cgi?id=46996 One of the things we do in llvm is decide if a type needs consecutive registers. Previously, we just checked if it was an array or not. (plus an SVE specific check that is not changing here) This causes some confusion when you arbitrary IR like: ``` %T1 = type { double, i1 }; define [ 1 x %T1 ] @foo() { entry: ret [ 1 x %T1 ] zeroinitializer } ``` We see it is an array so we call CC_AArch64_Custom_Block which bails out when it sees the i1, a type we don't want to put into a block. This leaves the location of the double in some kind of intermediate state and leads to odd codegen. Which then crashes the backend because it doesn't know how to implement what it's been asked for. You get this: ``` renamable $d0 = FMOVD0 $w0 = COPY killed renamable $d0 ``` Rather than this: ``` $d0 = FMOVD0 $w0 = COPY $wzr ``` The backend knows how to copy 64 bit to 64 bit registers, but not 64 to 32. It can certainly be taught how but the real issue seems to be us even trying to assign a register block in the first place. This change makes the logic of AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters a bit more in depth. If we find an array, also check that all the nested aggregates in that array have a single member type. Then CC_AArch64_Custom_Block's assumption of a type that looks like [ N x type ] will be valid and we get the expected codegen. New tests have been added to exercise these situations. Note that some of the output is not ABI compliant. The aim of this change is to simply handle these situations and not to make our processing of arbitrary IR ABI compliant. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104123	2021-06-16 13:56:01 +00:00
Andrea Di Biagio	70b37f4c03	[MCA][InstrBuilder] Always check for implicit uses of resource units (PR50725). When instructions are issued to the underlying pipeline resources, the mca::ResourceManager should also check for the presence of extra uses induced by the explicit consumption of multiple partially overlapping group resources. Fixes PR50725	2021-06-16 14:51:12 +01:00
James Henderson	5c1639fe06	[yaml2obj][obj2yaml] Support custom ELF section header string table name This patch adds support for a new field in the FileHeader, which states the name to use for the section header string table. This also allows combining the string table with another string table in the object, e.g. the symbol name string table. The field is optional. By default, .shstrtab will continue to be used. This partially fixes https://bugs.llvm.org/show_bug.cgi?id=50506. Reviewed by: Higuoxing Differential Revision: https://reviews.llvm.org/D104035	2021-06-16 10:02:23 +01:00
Lang Hames	834616146b	[ORC] Switch to WrapperFunction utility for calls to registration functions. Addresses FIXMEs in TPC-based EH-frame and debug object registration code by replacing manual argument serialization with WrapperFunction utility calls.	2021-06-16 18:05:58 +10:00
Rong Xu	82a0bb1afc	[SampleFDO] Place the discriminator flag variable into the used list. We create flag variable "__llvm_fs_discriminator__" in the binary to indicate that FSAFDO hierarchical discriminators are used. This variable might be GC'ed by the linker since it is not explicitly reference. I initially added the var to the use list in pass MIRFSDiscriminator but it did not work. It turned out the used global list is collected in lowering (before MIR pass) and then emitted in the end of pass pipeline. Here I add the variable to the use list in IR level's AddDiscriminators pass. The machine level code is still keep in the case IR's AddDiscriminators is not invoked. If this is the case, this just use -Wl,--export-dynamic-symbol=__llvm_fs_discriminator__ to force the emit. Differential Revision: https://reviews.llvm.org/D103988	2021-06-15 21:51:04 -07:00
Andrea Di Biagio	a04f01bab2	Revert "[MCA] Adding the CustomBehaviour class to llvm-mca" This reverts commit `f7a23ecece`. It appears to breaks buildbots that don't build the AMDGPU backend.	2021-06-15 21:41:36 +01:00
Patrick Holland	f7a23ecece	[MCA] Adding the CustomBehaviour class to llvm-mca Some instructions are not defined well enough within the target’s scheduling model for llvm-mca to be able to properly simulate its behaviour. The ideal solution to this situation is to modify the scheduling model, but that’s not always a viable strategy. Maybe other parts of the backend depend on that instruction being modelled the way that it is. Or maybe the instruction is quite complex and it’s difficult to fully capture its behaviour with tablegen. The CustomBehaviour class (which I will refer to as CB frequently) is designed to provide intuitive scaffolding for developers to implement the correct modelling for these instructions. Implementation details: llvm-mca does its best to extract relevant register, resource, and memory information from every MCInst when lowering them to an mca::Instruction. It then uses this information to detect dependencies and simulate stalls within the pipeline. For some instructions, the information that gets captured within the mca::Instruction is not enough for mca to simulate them properly. In these cases, there are two main possibilities: 1. The instruction has a dependency that isn’t detected by mca. 2. mca is incorrectly enforcing a dependency that shouldn’t exist. For the rest of this discussion, I will be focusing on (1), but I have put some thought into (2) and I may revisit it in the future. So we have an instruction that has dependencies that aren’t picked up by mca. The basic idea for both pipelines in mca is that when an instruction wants to be dispatched, we first check for register hazards and then we check for resource hazards. This is where CB is injected. If no register or resource hazards have been detected, we make a call to CustomBehaviour::checkCustomHazard() to give the target specific CB the chance to detect and enforce any custom dependencies. The return value for checkCustomHazaard() is an unsigned int representing the (minimum) number of cycles that the instruction needs to stall for. It’s fine to underestimate this value because when StallCycles gets down to 0, we’ll end up checking for all the hazards again before the instruction is actually dispatched. However, it’s important not to overestimate the value and the more accurate your estimate is, the more efficient mca’s execution can be. In general, for checkCustomHazard() to be able to detect these custom dependencies, it needs information about the current instruction and also all of the instructions that are still executing within the pipeline. The mca pipeline uses mca::Instruction rather than MCInst and the current information encoded within each mca::Instruction isn’t sufficient for my use cases. I had to add a few extra attributes to the mca::Instruction class and have them get set by the MCInst during instruction building. For example, the current mca::Instruction doesn’t know its opcode, and it also doesn’t know anything about its immediate operands (both of which I had to add to the class). With information about the current instruction, a list of all currently executing instructions, and some target specific objects (MCSubtargetInfo and MCInstrInfo which the base CB class has references to), developers should be able to detect and enforce most custom dependencies within checkCustomHazard. If you need more information than is present in the mca::Instruction, feel free to add attributes to that class and have them set during the lowering sequence from MCInst. Fortunately, in the in-order pipeline, it’s very convenient for us to pass these arguments to checkCustomHazard. The hazard checking is taken care of within InOrderIssueStage::canExecute(). This function takes a const InstRef as a parameter (representing the instruction that currently wants to be dispatched) and the InOrderIssueStage class maintains a SmallVector<InstRef, 4> which holds all of the currently executing instructions. For the out-of-order pipeline, it’s a bit trickier to get the list of executing instructions and this is why I have held off on implementing it myself. This is the main topic I will bring up when I eventually make a post to discuss and ask for feedback. CB is a base class where targets implement their own derived classes. If a target specific CB does not exist (or we pass in the -disable-cb flag), the base class is used. This base class trivially returns 0 from its checkCustomHazard() implementation (meaning that the current instruction needs to stall for 0 cycles aka no hazard is detected). For this reason, targets or users who choose not to use CB shouldn’t see any negative impacts to accuracy or performance (in comparison to pre-patch llvm-mca). Differential Revision: https://reviews.llvm.org/D104149	2021-06-15 21:30:48 +01:00
Vitaly Buka	a99f6d3071	[NFC] Fix "unused variable" warning	2021-06-15 12:59:05 -07:00
Jinsong Ji	9ddb625890	[NFC] Update renamed option in comments `c98ebda325` Rename fp-op fusion option (yet again) for compatibility with GCC option. The comment in the header should be updated too to avoid confusion.	2021-06-15 19:44:31 +00:00
Duncan P. N. Exon Smith	3302af9d4c	Support: Remove F_{None,Text,Append} compatibility synonyms, NFC Remove the compatibility spellings of `OF_{None,Text,Append}` that were left behind by `1f67a3cba9`. No functionality change here, just an API cleanup. Differential Revision: https://reviews.llvm.org/D101506	2021-06-15 12:04:09 -07:00
Roman Lebedev	e52364532a	[NewPM] Remove SpeculateAroundPHIs pass Addition of this pass has been botched. There is no particular reason why it had to be sold as an inseparable part of new-pm transition. It was added when old-pm was still the default, and very very few users were actually tracking new-pm, so it's effects weren't measured. Which means, some of the turnoil of the new-pm transition are actually likely regressions due to this pass. Likewise, there has been a number of post-commit feedback (post new-pm switch), namely * https://reviews.llvm.org/D37467#2787157 (regresses HW-loops) * https://reviews.llvm.org/D37467#2787259 (should not be in middle-end, should run after LSR, not before) * https://reviews.llvm.org/D95789 (an attempt to fix bad loop backedge metadata) and in the half year past, the pass authors (google) still haven't found time to respond to any of that. Hereby it is proposed to backout the pass from the pipeline, until someone who cares about it can address the issues reported, and properly start the process of adding a new pass into the pipeline, with proper performance evaluation. Furthermore, neither google nor facebook reports any perf changes from this change, so i'm dropping the pass completely. It can always be re-reverted should/if anyone want to pick it up again. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D104099	2021-06-15 20:35:55 +03:00
Lang Hames	0672d5d104	[ORC] Fix missing std::move.	2021-06-15 21:42:58 +10:00
Lang Hames	48fb8ecf44	[ORC] Fix narrowing-in-initializer-list warnings.	2021-06-15 21:39:16 +10:00
Lang Hames	5a28bdeeb6	[ORC] Fix missing function in unit test.	2021-06-15 21:39:00 +10:00
Lang Hames	5188b9af84	[ORC] Make WrapperFunctionResult's ValuePtr member non-const. The const qualifier was a hangover from an earlier iteration that allowed wrapper functions to return pointers to const memory. This feature has been removed, so there's no reason for this to be const any more, and removing it eliminates const-cast warnings.	2021-06-15 21:24:12 +10:00
Lang Hames	4eb9fe2e1a	[ORC] Port WrapperFunctionUtils and SimplePackedSerialization from ORC runtime. Replace the existing WrapperFunctionResult type in llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a version adapted from the ORC runtime's implementation. Also introduce the SimplePackedSerialization scheme (also adapted from the ORC runtime's implementation) for wrapper functions to avoid manual serialization and deserialization for calls to runtime functions involving common types.	2021-06-15 21:13:57 +10:00
Neil Henning	1540da3b78	ABI breaking changes fixes. This commit mostly just replaces bad uses of `NDEBUG` with uses of `LLVM_ENABLE_ABI_BREAKING_CHANGES` - the safe way to include ABI breaking changes (normally extra struct elements in headers). Differential Revision: https://reviews.llvm.org/D104216	2021-06-15 11:08:13 +01:00
Jay Foad	f5dc511c53	[IR] Remove forward declaration of GraphTraits from Type.h This has been unnecessary since r352353 removed GraphTraits specializations for Type, except that a couple of other headers were accidentally relying on this declaration. Differential Revision: https://reviews.llvm.org/D104119	2021-06-15 09:23:45 +01:00
Adrian Prantl	035217ff51	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. This reapplies the previously reverted patch with additional include order fixes for non-modular builds of LLDB. Differential Revision: https://reviews.llvm.org/D103575	2021-06-14 16:53:41 -07:00
Adrian Prantl	7a7c00761f	Revert "Allow signposts to take advantage of deferred string substitution" This reverts commit `03841edde7`. Unfortunately this still breaks the LLDB standalone bot.	2021-06-14 16:09:04 -07:00
Adrian Prantl	03841edde7	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. This reapplies the previsously reverted patch with additional MachO.h macro #undefs. Differential Revision: https://reviews.llvm.org/D103575	2021-06-14 14:19:41 -07:00
wlei	863184dd69	[CSSPGO] Aggregation by the last K context frames for cold profiles This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one. This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D104131	2021-06-14 10:33:43 -07:00
zhijian	7ed515d168	[AIX][XCOFF] emit vector info of traceback table. Summary: emit vector info of traceback table. Reviewers: Jason Liu,Hubert Tong Differential Revision: https://reviews.llvm.org/D93659	2021-06-14 11:15:22 -04:00
Florian Hahn	d767d1dd2c	[ADT] Use unnamed argument for unused arg in StringMapEntryStorage. This silences an 'unsused argument' warning. Similar to `c2006f857d`.	2021-06-14 15:54:57 +01:00
Jeroen Dobbelaere	bb8ce25e88	Intrinsic::getName: require a Module argument Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary. This fixes the issue that was detected by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288 (as mentioned by @fhahn), after committing D91250. Note that the `LLVMIntrinsicCopyOverloadedName` is being deprecated in favor of `LLVMIntrinsicCopyOverloadedName2`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99173	2021-06-14 14:52:29 +02:00
Guillaume Chatelet	1d49e5352f	[llvm] remove Sequence::asSmallVector() There's no need for `toSmallVector()` as `SmallVector.h` already provides a `to_vector` free function that takes a range. Reviewed By: Quuxplusone Differential Revision: https://reviews.llvm.org/D104024	2021-06-14 08:28:05 +00:00
Simon Moll	74d45b884c	[VP] Binary floating-point intrinsics. This patch implements vector-predicated intrinsics on IR level for fadd, fsub, fmul, fdiv and frem. There operate in the default floating-point environment. We will use constrained fp operand bundles for constrained vector-predicated fp math (D93455). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D93470	2021-06-14 08:51:41 +02:00
Xuanda Yang	e0bb502064	[LLParser] Remove outdated deplibs The comment mentions deplibs should be removed in 4.0. Removing it in this patch. Reviewed By: compnerd, dexonsmith, lattner Differential Revision: https://reviews.llvm.org/D102763	2021-06-14 12:46:12 +08:00
RamNalamothu	167e7afcd5	Implement DW_CFA_LLVM_* for Heterogeneous Debugging Add support in MC/MIR for writing/parsing, and DebugInfo. This is part of the Extensions for Heterogeneous Debugging defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html Specifically the CFI instructions implemented here are defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#cfa-definition-instructions Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D76877	2021-06-14 08:51:50 +05:30
Simon Pilgrim	4089e0bbfa	RawError.h - remove unused <string> include. NFCI.	2021-06-13 17:32:57 +01:00
Simon Pilgrim	033e594c59	DIPrinter.h - tidy implicit header dependencies. NFCI. We don't use <string> but we do use std::unique_ptr (<memory>) and llvm::Optional<>	2021-06-13 17:00:15 +01:00
Simon Pilgrim	3dc727e81b	ProfiledCallGraph.h - remove unused <string> include. NFCI.	2021-06-13 15:19:25 +01:00
Florian Hahn	b4583a5ad7	Revert "Allow signposts to take advantage of deferred string substitution" This reverts commit `4fc93a3a1f` because it breaks LLDB builds on certain macOS platform & SDK combinations, e.g. http://green.lab.llvm.org/green/job/lldb-cmake-standalone/3288/consoleFull#-195476041949ba4694-19c4-4d7e-bec5-911270d8a58c	2021-06-12 12:08:25 +01:00
Florian Hahn	5cd66420cc	Revert "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB" This reverts commit `1b748faf2b` because it breaks building the llvm-test-suite with -verify-machineinstrs on X86: http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/9585/ Running llc -verify-machineinstr on X86 crashes on the IR below: target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" %struct.widget = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [16 x [16 x i16]], [6 x [32 x i32]], [16 x [16 x i32]], [4 x [12 x [4 x [4 x i32]]]], [16 x i32], i8, i32, i32*, i32, i32, i32, i32, i32, %struct.baz, %struct.wobble.1, i32, i32, i32, i32, i32, i32, %struct.quux.2, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x i32], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32**, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x [2 x i32]], [3 x [2 x i32]], i32, i32, i64, i64, %struct.zot.3, %struct.zot.3, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } %struct.baz = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, %struct.snork, %struct.wombat.0, %struct.wobble, i32, i32, i32, i32, i32, i32, i32, i32, i32 (%struct.widget, %struct.eggs), i32, i32, i32, i32 } %struct.snork = type { %struct.spam, %struct.zot, i32 (%struct.wombat, %struct.widget, %struct.snork) } %struct.spam = type { i32, i32, i32, i32, i8, i32 } %struct.zot = type { i32, i32, i32, i32, i32, i8, i32* } %struct.wombat = type { i32, i32, i32, i32, i32, i32, i32, i32, void (i32, i32, i32, i32), void (%struct.wombat, %struct.widget, %struct.zot)* } %struct.wombat.0 = type { [4 x [11 x %struct.quux]], [2 x [9 x %struct.quux]], [2 x [10 x %struct.quux]], [2 x [6 x %struct.quux]], [4 x %struct.quux], [4 x %struct.quux], [3 x %struct.quux] } %struct.quux = type { i16, i8 } %struct.wobble = type { [2 x %struct.quux], [4 x %struct.quux], [3 x [4 x %struct.quux]], [10 x [4 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]] } %struct.eggs = type { [1000 x i8], [1000 x i8], [1000 x i8], i32, i32, i32, i32, i32, i32, i32, i32 } %struct.wobble.1 = type { i32, [2 x i32], i32, i32, %struct.wobble.1, %struct.wobble.1, i32, [2 x [4 x [4 x [2 x i32]]]], i32, i64, i64, i32, i32, [4 x i8], [4 x i8], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } %struct.quux.2 = type { i32, i32, i32, i32, i32, %struct.quux.2* } %struct.zot.3 = type { i64, i16, i16, i16 } define void @blam(%struct.widget* %arg, i32 %arg1) local_unnamed_addr { bb: %tmp = load i32, i32* undef, align 4 %tmp2 = sdiv i32 %tmp, 6 %tmp3 = sdiv i32 undef, 6 %tmp4 = load i32, i32* undef, align 4 %tmp5 = icmp eq i32 %tmp4, 4 %tmp6 = select i1 %tmp5, i32 %tmp3, i32 %tmp2 %tmp7 = getelementptr inbounds [4 x [4 x i32]], [4 x [4 x i32]]* undef, i64 0, i64 0, i64 0 %tmp8 = zext i16 undef to i32 %tmp9 = zext i16 undef to i32 %tmp10 = load i16, i16* undef, align 2 %tmp11 = zext i16 %tmp10 to i32 %tmp12 = zext i16 undef to i32 %tmp13 = zext i16 undef to i32 %tmp14 = zext i16 undef to i32 %tmp15 = load i16, i16* undef, align 2 %tmp16 = zext i16 %tmp15 to i32 %tmp17 = zext i16 undef to i32 %tmp18 = sub nsw i32 %tmp8, %tmp9 %tmp19 = shl nsw i32 undef, 1 %tmp20 = add nsw i32 %tmp19, %tmp18 %tmp21 = sub nsw i32 %tmp11, %tmp12 %tmp22 = shl nsw i32 undef, 1 %tmp23 = add nsw i32 %tmp22, %tmp21 %tmp24 = sub nsw i32 %tmp13, %tmp14 %tmp25 = shl nsw i32 undef, 1 %tmp26 = add nsw i32 %tmp25, %tmp24 %tmp27 = sub nsw i32 %tmp16, %tmp17 %tmp28 = shl nsw i32 undef, 1 %tmp29 = add nsw i32 %tmp28, %tmp27 %tmp30 = sub nsw i32 %tmp20, %tmp29 %tmp31 = sub nsw i32 %tmp23, %tmp26 %tmp32 = shl nsw i32 %tmp30, 1 %tmp33 = add nsw i32 %tmp32, %tmp31 store i32 %tmp33, i32* undef, align 4 %tmp34 = mul nsw i32 %tmp31, -2 %tmp35 = add nsw i32 %tmp34, %tmp30 store i32 %tmp35, i32* undef, align 4 %tmp36 = select i1 %tmp5, i32 undef, i32 undef br label %bb37 bb37: ; preds = %bb %tmp38 = load i32, i32* undef, align 4 %tmp39 = ashr i32 %tmp38, %tmp6 %tmp40 = load i32, i32* undef, align 4 %tmp41 = sdiv i32 %tmp39, %tmp40 store i32 %tmp41, i32* undef, align 4 ret void }	2021-06-12 11:41:38 +01:00
spupyrev	0a0800c4d1	A post-processing for BFI inference The current implementation for computing relative block frequencies does not handle correctly control-flow graphs containing irreducible loops. This results in suboptimally generated binaries, whose perf can be up to 5% worse than optimal. To resolve the problem, we apply a post-processing step, which iteratively updates block frequencies based on the frequencies of their predesessors. This corresponds to finding the stationary point of the Markov chain by an iterative method aka "PageRank computation". The algorithm takes at most O(\|E\| * IterativeBFIMaxIterations) steps but typically converges faster. It is turned on by passing option `use-iterative-bfi-inference` and applied only for functions containing profile data and irreducible loops. Tested on SPEC06/17, where it is helping to get correct profile counts for one of the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5% for binaries containing functions with hot irreducible loops. Reviewed By: hoy, wenlei, davidxl Differential Revision: https://reviews.llvm.org/D103289	2021-06-11 21:46:04 -07:00
Adrian Prantl	4fc93a3a1f	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. This reapplies the previsously reverted patch with support for platforms where signposts are unavailable. Differential Revision: https://reviews.llvm.org/D103575	2021-06-11 16:52:34 -07:00
Adrian Prantl	b90f9bea96	Revert "Allow signposts to take advantage of deferred string substitution" I forgot to make the LLDB macro conditional on Linux. This reverts commit `541ccd1c1b`.	2021-06-11 16:46:34 -07:00
Andrew Litteken	f6dea2e732	[IRSim] Strip out the findSimilarity call from the constructor Both doInitialize and runOnModule were running the entire analysis due to the actual work being done in the constructor. Strip it out here and only get the similarity during runOnModule. Author: lanza Reviewers: AndrewLitteken, paquette, plofti Differential Revision: https://reviews.llvm.org/D92524	2021-06-11 18:41:28 -05:00
Adrian Prantl	541ccd1c1b	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. Differential Revision: https://reviews.llvm.org/D103575	2021-06-11 16:35:43 -07:00
Daniil Fukalov	79ffbc9c9f	[NFC][CostModel] Fixed comment that comparisons work regardless of the state. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D104068	2021-06-11 23:48:49 +03:00
Kevin Athey	e0b469ffa1	[clang-cl][sanitizer] Add -fsanitize-address-use-after-return to clang. Also: - add driver test (fsanitize-use-after-return.c) - add basic IR test (asan-use-after-return.cpp) - (NFC) cleaned up logic for generating table of __asan_stack_malloc depending on flag. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D104076	2021-06-11 12:07:35 -07:00
Matt Arsenault	93f3c7cc3e	CodeGen: Fix missing const	2021-06-11 13:45:24 -04:00
Simon Pilgrim	0fc4016d91	APInt.h - add missing <utility> header. Some buildbots are complaining about std::move() after rG61cdaf66fe22be2b5942ddee4f46a998b4f3ee29	2021-06-11 13:35:12 +01:00
Simon Pilgrim	61cdaf66fe	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Fraser Cormack	71a02ddda1	[VP][NFC] Format comment to 80 columns	2021-06-11 12:53:48 +01:00
Bing1 Yu	56d5c46b49	[X86] Support __tile_stream_loadd intrinsic for new AMX interface Adding support for __tile_stream_loadd intrinsic. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D103784	2021-06-11 17:28:43 +08:00
Simon Pilgrim	f0a68bbc96	SampleProf.h - fix spelling mistake in assert message. NFC.	2021-06-11 10:24:14 +01:00
Simon Pilgrim	5e6bfb661e	[Analysis] Pass RecurrenceDescriptor as const reference. NFCI. We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.). Differential Revision: https://reviews.llvm.org/D104029	2021-06-11 10:24:14 +01:00
Sjoerd Meijer	c4a0969b9c	Function Specialization Pass This adds a function specialization pass to LLVM. Constant parameters like function pointers and constant globals are propagated to the callee by specializing the function. This is a first version with a number of limitations: - The pass is off by default, so needs to be enabled on the command line, - It does not handle specialization of recursive functions, - It does not yet handle constants and constant ranges, - Only 1 argument per function is specialised, - The cost-model could be further looked into, and perhaps related, - We are not yet caching analysis results. This is based on earlier work by Matthew Simpson (D36432) and Vinay Madhusudan. More recently this was also discussed on the list, see: https://lists.llvm.org/pipermail/llvm-dev/2021-March/149380.html. The motivation for this work is that function specialisation often comes up as a reason for performance differences of generated code between LLVM and GCC, which has this enabled by default from optimisation level -O3 and up. And while this certainly helps a few cpu benchmark cases, this also triggers in real world codes and is thus a generally useful transformation to have in LLVM. Function specialisation has great potential to increase compile-times and code-size. The summary from some investigations with this patch is: - Compile-time increases for short compile jobs is high relatively, but the increase in absolute numbers still low. - For longer compile-jobs, the extra compile time is around 1%, and very much in line with GCC. - It is difficult to blame one thing for compile-time increases: it looks like everywhere a little bit more time is spent processing more functions and instructions. - But the function specialisation pass itself is not very expensive; it doesn't show up very high in the profile of the optimisation passes. The goal of this work is to reach parity with GCC which means that eventually we would like to get this enabled by default. But first we would like to address some of the limitations before that. Differential Revision: https://reviews.llvm.org/D93838	2021-06-11 09:11:29 +01:00
Carl Ritson	2c2d2922a2	[ValueTypes] Define MVTs for v6i32, v6f32, v7i32, v7f32 For use in AMDGPU selection DAG. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103881	2021-06-11 08:58:16 +09:00
Nick Desaulniers	fc018ebb60	[IR] make -warn-frame-size into a module attr -Wframe-larger-than= is an interesting warning; we can't know the frame size until PrologueEpilogueInsertion (PEI); very late in the compilation pipeline. -Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO and needed to be re-specified via -plugin-opt. Instead, make it part of the IR proper as a module level attribute, similar to D103048. Introduce -fwarn-stack-size CC1 option. Reviewed By: rsmith, qcolombet Differential Revision: https://reviews.llvm.org/D103928	2021-06-10 16:15:27 -07:00
Jessica Paquette	933df6ca79	[AArch64][GlobalISel] Legalize scalar G_CTTZ + G_CTTZ_ZERO_UNDEF This adds legalization for scalar G_CTTZ and G_CTTZ_ZERO_UNDEF. Vector support requires handling vector G_BITREVERSE, which I haven't gotten around to yet. For G_CTTZ_ZERO_UNDEF, we just lower it to G_CTTZ. For G_CTTZ, we match SelectionDAG's lowering to a G_BITREVERSE + G_CTLZ. e.g. https://godbolt.org/z/nPEseYh1s (With this patch, we have slightly worse codegen than SDAG for types smaller than s32; it seems like we're missing a combine.) Also, this adds in a function to build G_BITREVERSE to MachineIRBuilder. Differential Revision: https://reviews.llvm.org/D104065	2021-06-10 15:29:51 -07:00
Joachim Meyer	4f01122c3f	[LV] Parallel annotated loop does not imply all loads can be hoisted. As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety if a loop is annotated parallel (`!llvm.loop.parallel_accesses`), is not expectable, the documentation for this behavior was since removed from the LangRef again, and can lead to invalid reads. This was observed in POCL (https://github.com/pocl/pocl/issues/757) and would require similar workarounds in current work at hipSYCL. The question remains why this was initially added and what the implications of removing this optimization would be. Do we need an alternative mechanism to propagate the information about legality of if-conversion? Or is the idea that conditional loads in `#pragma clang loop vectorize(assume_safety)` can be executed unmasked without additional checks flawed in general? I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous. Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103907	2021-06-10 23:37:57 +02:00
Philip Reames	7629b2a09c	[LI] Add a cover function for checking if a loop is mustprogress [nfc] Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.	2021-06-10 13:37:32 -07:00
Philip Reames	b6ee5f2b1d	Move code for checking loop metadata into Analysis [nfc] I need the mustprogress loop metadata in ScalarEvolution and it makes sense to keep all the accessors for quering loop metadate together.	2021-06-10 13:01:22 -07:00
Michael Kruse	a22236120f	[OpenMP] Implement '#pragma omp unroll'. Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations). Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99459	2021-06-10 14:30:17 -05:00
Guillaume Chatelet	e0569033e2	[llvm] Make Sequence reverse-iterable This is a roll forward of D102679. This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse. It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++(). Note: Compared to D102679, this patch introduces a `asSmallVector()` member function and fixes compilation issue with GCC 5. Differential Revision: https://reviews.llvm.org/D103948	2021-06-10 11:15:28 +00:00
Esme-Yi	ec43c1213a	[NFC][XCOFF] Replace structs FileHeader32/SectionHeader32 with constants. Summary: Some structs like FileHeader32/SectionHeader32 defined in llvm/include/llvm/BinaryFormat/XCOFF.h seem unnecessary, because we only need their size. So this patch removes them and defines size constants directly. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D103901	2021-06-10 11:10:45 +00:00
David Spickett	64de8763aa	Revert "Implementation of global.get/set for reftypes in LLVM IR" This reverts commit `31859f896c`. Causing SVE and RISCV-V test failures on bots.	2021-06-10 10:11:17 +00:00
Simon Pilgrim	4eb47e3cd4	[TargetLowering] getABIAlignmentForCallingConv - pass DataLayout by const reference. NFCI. Avoid unnecessary copies and match every other method in TargetLowering that takes DataLayout as an argument.	2021-06-10 10:55:24 +01:00
Paulo Matos	31859f896c	Implementation of global.get/set for reftypes in LLVM IR This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and lowering methods for load and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D95425	2021-06-10 10:07:45 +02:00
Esme-Yi	c8e980ab4a	[XCOFF][llvm-objdump] Dump the debug type in `--section-headers` option. Summary: Add XCOFF recognition of debug section types under `--section-headers` option. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D103079	2021-06-10 07:08:23 +00:00
Esme-Yi	8a23f74eb7	[llvm-objdump][XCOFF] Enable the -l (--line-numbers) option. Summary: Add support for dumping line number information for XCOFF object files in llvm-objdump. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D101272	2021-06-10 04:37:06 +00:00
Sam Powell	5b5ab80e31	Reland "[llvm] llvm-tapi-diff" This is relanding commit `d1d36f7ad2` . This patch additionally addresses failures found in buildbots due to unstable build ordering & post review comments. This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-09 21:17:34 -07:00
Jinsong Ji	4a89ed373c	[AIX] Add traceback ssp canary bit support We will need to set the ssp canary bit in traceback table to communicate with unwinder about the canary. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D103202	2021-06-10 02:40:02 +00:00
Justin Lebar	c962491a41	Save/restore OuterTemplateParams in AbstractManglingParser::parseEncoding. Previously we were only saving plain TemplateParams. Differential Revision: https://reviews.llvm.org/D103996	2021-06-09 17:56:23 -07:00
Cyndy Ishida	e7b755ecb1	Revert "Reland "[llvm] llvm-tapi-diff"" This reverts commit `20126c9fd4`. The sorting fixes failed to have stable output on different platforms.	2021-06-09 13:48:09 -07:00
Sam Powell	20126c9fd4	Reland "[llvm] llvm-tapi-diff" This is relanding commit `d1d36f7ad2` . This patch additionally addresses failures found in buildbots & post review comments. This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-09 10:35:41 -07:00
Fraser Cormack	502edebd9d	[ValueTypes][RISCV] Cap RVV fixed-length vectors by size This patch changes RVV's policy for its supported list of fixed-length vector types by capping by vector size rather than element count. Now all 1024-byte vectors (of supported element types) are supported, rather than all 256-element vectors. This is a more natural fit for the architecture, and allows us to, for example, improve the support for vector bitcasts. This change necessitated the adding of some new simple types to avoid "regressing" on the number of currently-supported vectors. We round out the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and `v512f16`. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103884	2021-06-09 12:15:37 +01:00
Florian Hahn	e978f6bc97	[LTO] Support new PM in ThinLTOCodeGenerator. This patch adds initial support for using the new pass manager when doing ThinLTO via libLTO. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D102627	2021-06-09 10:05:14 +01:00
Jan Kratochvil	09ac4eca66	Revert "[llvm] Sync DebugInfo.h with DebugInfoFlags.def" This reverts commit `093750dd0b`. It broke buildbots, goint to investigate it more.	2021-06-09 10:39:57 +02:00
Jan Kratochvil	093750dd0b	[llvm] Sync DebugInfo.h with DebugInfoFlags.def Command to see the differences: diff -u <(sed -n 's#^HANDLE_DI_FLAG ([^,], $[^()]$) $//.$\?$#\1#p' <llvm/include/llvm/IR/DebugInfoFlags.def \| grep -vw Largest) <(sed -n 's#^ LLVMDIFlag$[^ ]$ = (\?[0-9].$#\1#p' <llvm/include/llvm-c/DebugInfo.h) OCaml binding is more seriously out of sync but I have not tried to sync it. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D103910	2021-06-09 10:11:23 +02:00
Guillaume Chatelet	20f571dbff	[NFC] Reformat MachineValueType This is a follow up patch based on https://reviews.llvm.org/D103251#2804016. Differential Revision: https://reviews.llvm.org/D103893	2021-06-09 07:20:51 +00:00
Sterling Augustine	e11b5b87be	Add Twine support for std::string_view. With Twine now ubiquitous after rG92a79dbe91413f685ab19295fc7a6297dbd6c824, it needs support for string_view when building clang with newer C++ standards. This is similar to how StringRef is handled. Differential Revision: https://reviews.llvm.org/D103935	2021-06-08 20:19:04 -07:00
Brendon Cahoon	294efbbd3e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit `211e584fa2`. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Quinn Pham	898e38a3c1	[NFC] In the future, all intrinsics defined for compatibility with the XL compiler will be placed in this collection. This patch has no functional changes. Differential revision: https://reviews.llvm.org/D103921	2021-06-08 17:58:02 -05:00
Whitney Tsang	9b022a679b	Revert "Revert "[LoopNest] Fix Wdeprecated-copy warnings"" This reverts commit `07ef5805ab`. The broke of the sanitizer-windows bot: https://lab.llvm.org/buildbot/#/builders/127/builds/12064 is not caused by the original commit. Differential Revision: https://reviews.llvm.org/D103752	2021-06-08 21:51:53 +00:00
Whitney Tsang	07ef5805ab	Revert "[LoopNest] Fix Wdeprecated-copy warnings" This reverts commit `dee1f0cb34`. It appears that this change broke the sanitizer-windows bot: https://lab.llvm.org/buildbot/#/builders/127/builds/12064 Differential Revision: https://reviews.llvm.org/D103752	2021-06-08 20:46:12 +00:00
Brendon Cahoon	211e584fa2	Revert "[AMDGPU] Add gfx1013 target" This reverts commit `ea10a86984`. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00
Abhina Sreeskantharajan	0e8506deba	[SystemZ][z/OS] Pass OpenFlags when creating tmp files This patch https://reviews.llvm.org/D102876 caused some lit regressions on z/OS because tmp files were no longer being opened based on binary/text mode. This patch passes OpenFlags when creating tmp files so we can open files in different modes. Reviewed By: amccarth Differential Revision: https://reviews.llvm.org/D103806	2021-06-08 14:45:34 -04:00
Nick Desaulniers	3787ee4571	reland [IR] make -stack-alignment= into a module attr Relands commit `433c8d950c` with fixes for MIPS. Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO. Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079 Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103048	2021-06-08 10:59:46 -07:00
Mehdi Amini	a4e2cf712a	Revert "[llvm] Make Sequence reverse-iterable" This reverts commit `e772216e70` (and fixup `7f6c878a2c`). The build is broken with gcc5 host compiler: In file included from from mlir/lib/Dialect/Utils/StructuredOpsUtils.cpp:9: tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: error: type/value mismatch at argument 1 in template parameter list for 'template<class ItTy, class FuncTy, class FuncReturnTy> class llvm::mapped_iterator' std::function<T(ptrdiff_t)>>; ^ tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: note: expected a type, got 'decltype (seq<ptrdiff_t>(0, 0))::const_iterator'	2021-06-08 17:03:10 +00:00
Brendon Cahoon	ea10a86984	[AMDGPU] Add gfx1013 target Differential Revision: https://reviews.llvm.org/D103663	2021-06-08 12:49:49 -04:00
Nick Desaulniers	a596b54d47	Revert "[IR] make -stack-alignment= into a module attr" This reverts commit `433c8d950c`. Breaks the MIPS build.	2021-06-08 08:55:50 -07:00
Nick Desaulniers	433c8d950c	[IR] make -stack-alignment= into a module attr Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO. Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079 Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103048	2021-06-08 08:31:04 -07:00
Whitney Tsang	dee1f0cb34	[LoopNest] Fix Wdeprecated-copy warnings error: definition of implicit copy constructor for 'LoopNest' is deprecated because it has a user-declared copy assignment operator [-Werror,-Wdeprecated-copy] LoopNest &operator=(const LoopNest &) = delete; Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D103752	2021-06-08 14:48:13 +00:00
Guillaume Chatelet	7f6c878a2c	Fix missing header and namespace qualifier in ADT Sequence	2021-06-08 14:11:54 +00:00
Guillaume Chatelet	e772216e70	[llvm] Make Sequence reverse-iterable This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse. It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++(). Differential Revision: https://reviews.llvm.org/D102679	2021-06-08 13:18:57 +00:00
Hans Wennborg	386b66b2fc	Revert "3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" > This reapplies `c0f3dfb9`, which was reverted following the discovery of > crashes on linux kernel and chromium builds - these issues have since > been fixed, allowing this patch to re-land. This reverts commit `36ec97f76a`. The change caused non-determinism in the compiler, see comments on the code review at https://reviews.llvm.org/D91722. Reverting to unbreak people's builds until that can be addressed. This also reverts the follow-up "[DebugInfo] Limit the number of values that may be referenced by a dbg.value" in `a0bd6105d8`.	2021-06-08 14:54:08 +02:00
Simon Moll	0f9d299122	[VP] getDeclarationForParams `VPIntrinsic::getDeclarationForParams` creates a vp intrinsic declaration for parameters you want to call it with. This is in preparation of a new builder class that makes emitting vp intrinsic code nearly as convenient as using a plain ir builder (aka `VectorBuilder`, to be used by D99750). Reviewed By: frasercrmck, craig.topper, vkmr Differential Revision: https://reviews.llvm.org/D102686	2021-06-08 14:21:28 +02:00
Timm Bäder	22875b2ce3	[NFC] Remove some include cycles These files include themselves directly.	2021-06-08 14:00:39 +02:00
maekawatoshiki	09e92c607c	[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass. The next patch will utilize LoopNest to effectively handle loop nests. Also, a crash problem on legacy pass manager is fixed. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D99149	2021-06-08 20:30:02 +09:00
Lang Hames	4f16ccdab2	[JITLink] Clarify LinkGraph::splitBlock contract in comment.	2021-06-08 18:51:12 +10:00
Tomasz Miąsko	82b7e822d0	[Demangle][Rust] Parse path backreferences Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103459	2021-06-08 10:01:49 +02:00
hsmahesha	3af5f3e692	[IR] Add utility to convert constant expression operands (of an instruction) to instructions. In the situation where we need to replace a constant operand C from a constant expression CE by an instruction NI, it not possible without converting CE itself into an instruction. This utility helps to convert the given set of constant expression operands from an instruction I into a corresponding set of instructions. The current use-case for this utility is from the patches - https://reviews.llvm.org/D103225 and https://reviews.llvm.org/D103655. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D103661	2021-06-08 03:22:32 +05:30
Amir Ayupov	8ec73e96b7	[ELF] getRelocatedSection: remove the check for ET_REL object file getRelocatedSection interface should not check that the object file is relocatable, as executable files may have relocations preserved with `--emit-relocs` linker flag. The relocations are useful in context of post-link binary analysis for function reference identification. For example, BOLT relies on relocations to perform function reordering. Reviewed By: MaskRay, jhenderson Differential Revision: https://reviews.llvm.org/D102296	2021-06-07 13:17:00 -07:00
Harald van Dijk	75521bd9d8	[X32] Add Triple::isX32(), use it. So far, support for x86_64-linux-gnux32 has been handled by explicit comparisons of Triple.getEnvironment() to GNUX32. This worked as long as x86_64-linux-gnux32 was the only X32 environment to worry about, but we now have x86_64-linux-muslx32 as well. To support this, this change adds an isX32() function and uses it. It replaces all checks for GNUX32 or MuslX32 by isX32(), except for the following: - Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat GNUX32 and MuslX32 differently. - computeTargetTriple() needs to be able to transform triples to add or remove X32 from the environment and needs to map GNU to GNUX32, and Musl to MuslX32. - getMultiarchTriple() completely lacks any Musl support and retains the explicit check for GNUX32 as it can only return x86_64-linux-gnux32. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D103777	2021-06-07 20:48:39 +01:00
Philip Reames	38540d71c7	[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be: for (unsigned i = 0; i < N; i += 2) { body; } Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body". The basic intuition behind this patch is as follows: * A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max. * Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this. In LLVM, this is tied to the mustprogress attribute. Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop. A couple notes for those reading along: * I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules. * We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path. Differential Revision: https://reviews.llvm.org/D103118	2021-06-07 11:24:00 -07:00
jasonliu	8e84311a84	[XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing Add in the ability of parsing symbol table for 64 bit object. Reviewed By: jhenderson, DiggerLin Differential Revision: https://reviews.llvm.org/D85774	2021-06-07 17:24:13 +00:00
Raphael Isemann	f10b9ca9c6	[NFC] Add missing include to LaneBitmask.h to fix modules build	2021-06-07 18:43:00 +02:00
Sander de Smalen	c908196e10	[CostModel] Return Invalid cost in getArithmeticCost instead of crashing for scalable vectors. This fixes an issue in BasicTTIImpl.h where it tries to do a cast<FixedVectorType> on a scalable vector type in order to get the scalarization cost. Because scalarization of scalable vectors is not supported, we return Invalid instead. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103798	2021-06-07 17:26:23 +01:00
Tomasz Miąsko	619a65e5e4	[Demangle][Rust] Parse dyn-trait-assoc-binding Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103364	2021-06-07 18:18:31 +02:00
Tomasz Miąsko	1499afa09b	[Demangle][Rust] Parse dyn-trait Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103361	2021-06-07 18:18:31 +02:00
Tomasz Miąsko	89615a5e92	[Demangle][Rust] Parse dyn-bounds Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103151	2021-06-07 18:18:30 +02:00
Bradley Smith	60c9b5f35c	[AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics Use llvm.experimental.vector.insert instead of storing into an alloca when generating code for these intrinsics. This defers the codegen of the generated vector to instruction selection, allowing existing shufflevector style optimizations to apply. Additionally, introduce a new target transform that can recognise fixed predicate patterns in the svbool variants of these intrinsics. Differential Revision: https://reviews.llvm.org/D103082	2021-06-07 12:21:38 +01:00
Guillaume Chatelet	1da2c7d25c	[NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE Differential Revision: https://reviews.llvm.org/D103251	2021-06-07 10:04:16 +00:00
Jingu Kang	a2a0ac42ab	[SimpleLoopBoundSplit] Split Bound of Loop which has conditional branch with IV This pass transforms loops that contain a conditional branch with induction variable. For example, it transforms left code to right code: newbound = min(n, c) while (iv < n) { while(iv < newbound) { A A if (iv < c) B B C C } } if (iv != n) { while (iv < n) { A C } } Differential Revision: https://reviews.llvm.org/D102234	2021-06-07 10:55:25 +01:00
Esme-Yi	50bb1b930d	[yaml2obj] Initial the support of yaml2obj for 32-bit XCOFF. Summary: The patch implements the mapping of the Yaml information to XCOFF object file to enable the yaml2obj tool for XCOFF. Currently only 32-bit is supported. Reviewed By: jhenderson, shchenz Differential Revision: https://reviews.llvm.org/D95505	2021-06-07 04:14:44 +00:00
maekawatoshiki	0a9d079931	Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass" This reverts commit `2165360003`. To fix the crash problem in legacy pass manager	2021-06-07 01:26:47 +09:00
Nikita Popov	1ffa6499ea	[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759	2021-06-06 16:29:50 +02:00
Nikita Popov	506875c879	[TargetLowering] Move methods out of line (NFC) Move methods using IRBuilder out of line, so we can drop the dependency on the header.	2021-06-06 16:02:10 +02:00
Simon Pilgrim	5fc8cdcb03	BreadthFirstIterator.h - fix uninitialized variable warning in default constructor. NFCI.	2021-06-06 14:13:08 +01:00
Simon Pilgrim	6e90192fdf	PatternMatch.h - wrap WrapFlags tests inside brackets to stop static analysis warning about & vs && usage. NFCI.	2021-06-06 13:38:39 +01:00
Simon Pilgrim	139a36454f	SmallVector.h - remove unused MathExtras.h header. NFCI.	2021-06-06 12:05:39 +01:00
Simon Pilgrim	c18df1e156	Fix uninitialized variable warnings. NFCI.	2021-06-06 11:09:55 +01:00
Simon Pilgrim	b47a7bb703	Revert rG0b18c4c0ec03f0321ee83b9976da5777d0e4f53f "SmallVector.h - remove unused MathExtras.h header (REAPPLIED). NFCI." Buildbots still seem to find implicit header dependencies that I can't locally....	2021-06-06 09:39:19 +01:00
Simon Pilgrim	0b18c4c0ec	SmallVector.h - remove unused MathExtras.h header (REAPPLIED). NFCI. Try again to remove this header - I think I've found the implicit dependencies (mainly for <cmath>) on linux builds now.	2021-06-06 09:30:15 +01:00
Simon Pilgrim	e3ae4ce66e	Revert rG7b839b3542983a313a9bf9f8d8039ceeea35c4d7 - "SmallVector.h - remove unused MathExtras.h header. NFCI." Breaks on linux buildbots as I seem to have missed some implicit header dependencies....	2021-06-05 20:59:46 +01:00
Simon Pilgrim	7b839b3542	SmallVector.h - remove unused MathExtras.h header. NFCI.	2021-06-05 20:19:58 +01:00
Simon Pilgrim	fe6c45dd27	BitstreamWriter.h - add missing implicit MathExtras.h header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h	2021-06-05 19:20:14 +01:00
Simon Pilgrim	128f5d16ef	ELFTypes.h - add missing implicit MathExtras.h header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h	2021-06-05 19:11:40 +01:00
Simon Pilgrim	d118fa2914	[MCA] Support.h - add missing implicit MathExtras.h header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h	2021-06-05 19:10:49 +01:00
Simon Pilgrim	6ebb28d32e	EndianStream.h - add missing implicit MathExtras.h header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h	2021-06-05 18:05:40 +01:00
Roman Lebedev	e350494fb0	[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to loose some optimization potential.	2021-06-05 12:17:51 +03:00
Nikita Popov	db45746821	[LoopUnroll] Separate peeling from unrolling Loop peeling is currently performed as part of UnrollLoop(). Outside test scenarios, it is always performed with an unroll count of 1. This means that unrolling doesn't actually do anything apart from performing post-unroll simplification. When testing, it's currently possible to specify both an explicit peel count and an explicit unroll count. This doesn't perform any sensible operation and may result in miscompiles, see https://bugs.llvm.org/show_bug.cgi?id=45939. This patch moves peeling from UnrollLoop() into tryToUnrollLoop(), so that peeling does not also perform a susequent unroll. We only run the post-unroll simplifications. Specifying both an explicit peel count and unroll count is forbidden. In the future, we may want to support both (non-PGO) peeling a loop and unrolling it, but this needs to be done by first performing the peel and then recalculating unrolling heuristics on a now possibly analyzable loop. Differential Revision: https://reviews.llvm.org/D103362	2021-06-05 10:32:00 +02:00
Amir Ayupov	065a9316aa	[MC] Add getLSDASection interface This diff adds getLSDASection method to MCObjectFileInfo. Test plan: make check-all Differential revision: https://reviews.llvm.org/D102298	2021-06-05 00:28:20 -07:00
Rong Xu	8d581857d7	[SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550	2021-06-04 11:22:06 -07:00
Joseph Huber	4a08163c73	[Attributor] Check HeapToStack's state for isKnownHeapToStack This patch changes the `isKnownHeapToStack` and `isAssumedHeapToStack` member functions to return if a function call is going to be altered by HeapToStack. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103574	2021-06-04 12:38:33 -04:00
Joseph Huber	8bb713207d	[Attributor] Allow lookupAAFor to return null on invalid state This patch adds an option to `lookupAAFor` that allows it to return a nullptr if the state of the looked up attribute is invalid. This is so future passes can use this to query other attributes with the guarantee that they are valid. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103556	2021-06-04 12:29:15 -04:00
Alexey Bataev	c84a5448b5	[OPENMP]Fix PR50129: omp cancel parallel not working as expected. Need to emit a call for __kmpc_cancel_barrier in the exit block for __kmpc_cancel function call if cancellation of the parallel block is requested. Differential Revision: https://reviews.llvm.org/D103646	2021-06-04 08:24:55 -07:00
Mirko Brkusanin	35ef4c940b	[AMDGPU][GlobalISel] Legalize G_ABS Legalize and select G_ABS so that we can use llvm.abs intrinsic Differential Revision: https://reviews.llvm.org/D102391	2021-06-04 14:46:43 +02:00
Cyndy Ishida	5337c7550d	Revert "[llvm] llvm-tapi-diff" This reverts commit `d1d36f7ad2`. Reverting this patch to investigate linux bot failures + fix with author offline	2021-06-03 21:10:51 -07:00
Arthur Eubanks	738abfdbea	[NFC] Remove checking pointee type for byval/preallocated type These currently always require a type parameter. The bitcode reader already upgrades old bitcode without the type parameter to use the pointee type.	2021-06-03 19:09:09 -07:00
zero9178	619fa0d7fc	[NFC] Add missing includes for LLVM_ENABLE_MODULES builds Building LLVM with the LLVM_ENABLE_MODULES cmake option fails when the modules are being compiled due to missing includes. This is a side effect of some transitive includes that changed recently. Differential Revision: https://reviews.llvm.org/D103645	2021-06-03 23:29:03 +02:00
Philip Reames	5c0d1b2f90	[LoopUnroll] Eliminate PreserveCondBr parameter and fix a bug in the process This builds on D103584. The change eliminates the coupling between unroll heuristic and implementation w.r.t. knowing when the passed in trip count is an exact trip count or a max trip count. In theory the new code is slightly less powerful (since it relies on exact computable trip counts), but in practice, it appears to cover all the same cases. It can also be extended if needed. The test change shows what appears to be a bug in the existing code around the interaction of peeling and unrolling. The original loop only ran 8 iterations. The previous output had the loop peeled by 2, and then an exact unroll of 8. This meant the loop ran a total of 10 iterations which appears to have been a miscompile. Differential Revision: https://reviews.llvm.org/D103620	2021-06-03 14:09:16 -07:00
Sam Powell	d1d36f7ad2	[llvm] llvm-tapi-diff This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-03 11:38:00 -07:00
Eli Friedman	44cdf771fe	[AtomicExpand] Merge cmpxchg success and failure ordering when appropriate. If we're not emitting separate fences for the success/failure cases, we need to pass the merged ordering to the target so it can emit the correct instructions. For the PowerPC testcase, we end up with extra fences, but that seems like an improvement over missing fences. If someone wants to improve that, the PowerPC backed could be taught to emit the fences after isel, instead of depending on fences emitted by AtomicExpand. Fixes https://bugs.llvm.org/show_bug.cgi?id=33332 . Differential Revision: https://reviews.llvm.org/D103342	2021-06-03 11:34:35 -07:00
Artur Pilipenko	5a2aec3f27	NFC. Mark DOTFuncInfo getters as const This is a preparatory refactoring for introducing new types of hidden blocks.	2021-06-03 11:27:06 -07:00
Artur Pilipenko	a06e63fa52	NFC. Refactor DOTGraphTraits::isNodeHidden Restructure handling of cfg-hide-unreachable-paths and cfg-hide-deoptimize-paths options so as to make it easier to introduce new types of hidden blocks.	2021-06-03 11:27:06 -07:00
Philip Reames	44d70d298a	[LoopUnroll] Eliminate PreserveOnlyFirst parameter [nfc] This is a first step towards simplifying the transform interface to be less error prone. The basic idea is that querying SCEV is cheap (since it's cached) and we can just check for properties related to branch folding in the transform method instead of relying on the heuristic part to pass everything in correctly. Differential Revision: https://reviews.llvm.org/D103584	2021-06-03 10:33:14 -07:00
Nikita Popov	b0ab79ee2d	[MC] Add missing include (NFC) Try to fix buildbots after `983565a6fe`.	2021-06-03 18:50:00 +02:00
Nikita Popov	983565a6fe	[ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) This is a followup to D103422. The DenseMapInfo implementations for ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h headers, which means that these two headers no longer need to be included by DenseMapInfo.h. This required adding a few additional includes, as many files were relying on various things pulled in by ArrayRef.h. Differential Revision: https://reviews.llvm.org/D103491	2021-06-03 18:34:36 +02:00
David Spickett	f4543dce5d	[clang][ARM] Remove arm2/3/6/7m CPU names These legacy CPUs are known to clang but not llvm. Their use was ignored by llvm and it would print a warning saying it did not recognise them. However because some of them are default CPUs for their architecture, you would get those warnings even if you didn't choose a cpu explicitly. (now those architectures will default to a "generic" CPU) Information is thin on the ground for these older chips so this is the best I could find: https://en.wikichip.org/wiki/acorn/microarchitectures/arm2 https://en.wikichip.org/wiki/acorn/microarchitectures/arm3 https://en.wikichip.org/wiki/arm_holdings/microarchitectures/arm6 https://en.wikichip.org/wiki/arm_holdings/microarchitectures/arm7 Final part of fixing https://bugs.llvm.org/show_bug.cgi?id=50454. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103028	2021-06-03 08:55:44 +00:00
Min-Yih Hsu	344e919b1a	[CodeGen][NFC] Remove unused virtual function `TargetFrameLowering::emitCalleeSavedFrameMoves` with 4 arguments is not used anywhere in CodeGen. Thus it shouldn't be exposed as a virtual function. NFC. Differential Revision: https://reviews.llvm.org/D103328	2021-06-02 13:11:12 -07:00
Stefan Pintilie	1ed2e9b9a0	[NFC] Remove variable that was set but not used. The buildbot ppc64le-lld-multistage-test has been failing because the variable Tag in Waymaking.h is set but not used. This patch removes that varaible.	2021-06-02 13:20:32 -05:00
Rong Xu	6745ffe4fa	[SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is mainly for ProfileData part of change. It will load FS Profile when such profile is detected. For an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. For other format profiles, the users need to use an internal option (-profile-isfs) to tell the compiler that the profile uses FS discriminators. This patch also simplified the bit API used by FS discriminators. Differential Revision: https://reviews.llvm.org/D103041	2021-06-02 10:32:52 -07:00
Qunyan Mangus	cbde248736	Add getDemandedBits for uses. Add getDemandedBits method for uses so we can query demanded bits for each use. This can help getting better use information. For example, for the code below define i32 @test_use(i32 %a) { %1 = and i32 %a, -256 %2 = or i32 %1, 1 %3 = trunc i32 %2 to i8 (didn't optimize this to 1 for illustration purpose) ... some use of %3 ret %2 } if we look at the demanded bit of %2 (which is all 32 bits because of the return), we would conclude that %a is used regardless of how its return is used. However, if we look at each use separately, we will see that the demanded bit of %2 in trunc only uses the lower 8 bits of %a which is redefined, therefore %a's usage depends on how the function return is used. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D97074	2021-06-02 10:07:40 -04:00
Sander de Smalen	d41cb6bb26	[LV] Build and cost VPlans for scalable VFs. This patch uses the calculated maximum scalable VFs to build VPlans, cost them and select a suitable scalable VF. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D98722	2021-06-02 14:47:47 +01:00
Sean Fertile	81f7607f7c	[PowerPC][AIX} FIx AIX bootstrap build. A recent patch: https://reviews.llvm.org/rGe0921655b1ff8d4ba7c14be59252fe05b705920e changed clangs AIX bitfield handling to use 4-byte bitfield containers, matching XLs behavior. This change triggers static assert failures when bootstrapping. Change the macro we check to enable bitfield packing on AIX to `__clang__` which is defined by both xlclang and clang. Differential Revision: https://reviews.llvm.org/D103474	2021-06-02 09:31:11 -04:00
Daniil Fukalov	0195e594fe	[TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102915	2021-06-02 16:04:11 +03:00
Bjorn Pettersson	536e02a23c	[CodeGen] Refactor libcall lookups for RTLIB::POWI_* Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_* libcall for a given value type. This includes a small refactoring of ExpandFPLibCall and ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code, plus adding an ExpandFPLibCall version that can be called directly when expanding FPOWI/STRICT_FPOWI to ensure that we actually use the same RTLIB::Libcall when expanding the libcall as we used when checking the legality of such a call by doing a getLibcallName check. Differential Revision: https://reviews.llvm.org/D103050	2021-06-02 11:40:34 +02:00
Bjorn Pettersson	9c54ee4378	[SimplifyLibCalls] Take size of int into consideration when emitting ldexp/ldexpf When rewriting powf(2.0, itofp(x)) -> ldexpf(1.0, x) exp2(sitofp(x)) -> ldexp(1.0, sext(x)) exp2(uitofp(x)) -> ldexp(1.0, zext(x)) the wrong type was used for the second argument in the ldexp/ldexpf libc call, for target architectures with 16 bit "int" type. The transform incorrectly used a bitcasted function pointer with a 32-bit argument when emitting the ldexp/ldexpf call for such targets. The fault is solved by using the correct function prototype in the call, by asking TargetLibraryInfo about the size of "int". TargetLibraryInfo by default derives the size of the int type by assuming that it is 16 bits for 16-bit architectures, and 32 bits otherwise. If this isn't true for a target it should be possible to override that default in the TargetLibraryInfo initializer. Differential Revision: https://reviews.llvm.org/D99438	2021-06-02 11:40:34 +02:00
Tomasz Miąsko	a67a234ec7	[Demangle][Rust] Parse binders Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102729	2021-06-02 10:36:45 +02:00
Arthur Eubanks	8961293851	[OpaquePtr] Create API to make a copy of a PointerType with some address space Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429	2021-06-01 16:52:32 -07:00
Daniel Sanders	9372662050	fixup: Missing operator in [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one My local compiler was fine with it but the bots complain about ambiguous types.	2021-06-01 13:58:03 -07:00
Daniel Sanders	aaac268285	[globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one It's still in use in a few places so we can't delete it yet but there's not many at this point. Differential Revision: https://reviews.llvm.org/D103352	2021-06-01 13:23:48 -07:00
Andrew Kelley	936ca1e21a	WindowsSupport.h: do not depend on private config header WindowsSupport.h is a public header, however if it gets included, will cause a compile error indicating that llvm/Config/config.h cannot be found, because config.h is a private header. However there is no actual dependency on the private things in this header, so it can be changed to the public config header. Reviewed By: amccarth Differential Revision: https://reviews.llvm.org/D103370	2021-06-01 23:05:03 +03:00
Jessica Paquette	e7f501b5e7	[GlobalISel][AArch64] Combine and (lshr x, cst), mask -> ubfx x, cst, width Also add a target hook which allows us to get around custom legalization on AArch64. Differential Revision: https://reviews.llvm.org/D99283	2021-06-01 10:56:17 -07:00
Guozhi Wei	1b748faf2b	[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 Similar optimization can also be applied to LEA/ADD sequence. The modifications to TwoAddressInstructionPass is to ensure the operands of ADD instruction has expected order (the dest register of LEA should be src register of ADD). Differential Revision: https://reviews.llvm.org/D101970	2021-06-01 10:31:30 -07:00
Eli Friedman	fd229caa01	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
Nikita Popov	fd7e309e02	[ADT] Move DenseMapInfo for APInt into APInt.h (PR50527) As suggested in https://bugs.llvm.org/show_bug.cgi?id=50527, this moves the DenseMapInfo for APInt and APSInt into the respective headers, removing the need to include APInt.h and APSInt.h from DenseMapInfo.h. We could probably do the same from StringRef and ArrayRef as well. Differential Revision: https://reviews.llvm.org/D103422	2021-06-01 18:31:41 +02:00
Daniil Seredkin	13140120dc	[InstCombine] Relax constraints of uses for exp(X) * exp(Y) -> exp(X + Y) InstCombine didn't perform the transformations when fmul's operands were the same instruction because it required to have one use for each of them which is false in the case. This patch fixes this + adds tests for them and introduces a new function isOnlyUserOfAnyOperand to check these cases in a single place. This patch is a result of discussion in D102574. Differential Revision: https://reviews.llvm.org/D102698	2021-06-01 08:33:23 -04:00
Andy Wingo	82f92e35c6	[WebAssembly][CodeGen] IR support for WebAssembly local variables This patch adds TargetStackID::WasmLocal. This stack holds locations of values that are only addressable by name -- not via a pointer to memory. For the WebAssembly target, these objects are lowered to WebAssembly local variables, which are managed by the WebAssembly run-time and are not addressable by linear memory. For the WebAssembly target IR indicates that an AllocaInst should be put on TargetStackID::WasmLocal by putting it in the non-integral address space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift these allocations to SSA locals, but any alloca that reaches instruction selection (usually in non-optimized builds) will be assigned the new TargetStackID there. Loads and stores to those values are transformed to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes, which then lower to the type-specific LOCAL_GET_I32 etc instructions via tablegen patterns. Differential Revision: https://reviews.llvm.org/D101140	2021-06-01 11:31:39 +02:00
Arthur Eubanks	372237487e	[OpaquePtr] Remove some uses of PointerType::getElementType()	2021-05-31 16:11:25 -07:00
Florian Hahn	aa00b1d763	[LV] Try to sink users recursively for first-order recurrences. Update isFirstOrderRecurrence to explore all uses of a recurrence phi and check if we can sink them. If there are multiple users to sink, they are all mapped to the previous instruction. Fixes PR44286 (and another PR or two). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D84951	2021-05-31 19:55:33 +01:00
Andrea Di Biagio	9853d0db1e	[MCA][NFCI] Minor changes to InstrBuilder and Instruction. This is based on the assumption that most simulated instructions don't define more than one or two registers. This is true for example on x86, where most instruction definitions don't declare more than one register write. The default code region size has been increased from 8 to 16. This is based on the assumption that, for small microbenchmarks, the typical code snippet size is often less than 16 instructions. mca::Instruction now uses bitfields to pack flags. No functional change intended.	2021-05-31 17:05:13 +01:00
Daniil Fukalov	e853d3b274	[NFC] MemoryDependenceAnalysis cleanup. 1. Removed redundant includes, 2. Removed never defined and used `releaseMemory()`. 3. Fixed member functions names first letter case. 4. Renamed duplicate (in nested struct `NonLocalPointerInfo`) name `NonLocalDeps` to `NonLocalDepsMap`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102358	2021-05-31 18:07:55 +03:00
Roman Lebedev	f7c95c3322	[NFC] ScalarEvolution: apply SSO to the ExprValueMap value ExprValueMap is a map from SCEV * to a set-vector of (Value , ConstantInt ) pair, and while the map itself will likely be big-ish (have many keys), it is a reasonable assumption that each key will refer to a small-ish number of pairs. In particular looking at n=512 case from https://bugs.llvm.org/show_bug.cgi?id=50384, the small-size of 4 appears to be the sweet spot, it results in the least allocations while minimizing memory footprint. ``` $ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i \| tail -n 6; echo ""; done heaptrack.opt.0-orig.gz total runtime: 14.32s. calls to allocation functions: 8222442 (574192/s) temporary memory allocations: `2419000` (168924/s) peak heap memory consumption: 190.98MB peak RSS (including heaptrack overhead): 239.65MB total memory leaked: 67.58KB heaptrack.opt.1-n1.gz total runtime: 13.72s. calls to allocation functions: 7184188 (523705/s) temporary memory allocations: 2419017 (176338/s) peak heap memory consumption: 191.38MB peak RSS (including heaptrack overhead): 239.64MB total memory leaked: 67.58KB heaptrack.opt.2-n2.gz total runtime: 12.24s. calls to allocation functions: 6146827 (502355/s) temporary memory allocations: 2418997 (197695/s) peak heap memory consumption: 163.31MB peak RSS (including heaptrack overhead): 211.01MB total memory leaked: 67.58KB heaptrack.opt.3-n4.gz total runtime: 12.28s. calls to allocation functions: 6068532 (494260/s) temporary memory allocations: 2418985 (197017/s) peak heap memory consumption: 155.43MB peak RSS (including heaptrack overhead): 201.77MB total memory leaked: 67.58KB heaptrack.opt.4-n8.gz total runtime: 12.06s. calls to allocation functions: 6068042 (503321/s) temporary memory allocations: 2418992 (200646/s) peak heap memory consumption: 166.03MB peak RSS (including heaptrack overhead): 213.55MB total memory leaked: 67.58KB heaptrack.opt.5-n16.gz total runtime: 12.14s. calls to allocation functions: 6067993 (499958/s) temporary memory allocations: 2418999 (199307/s) peak heap memory consumption: 187.24MB peak RSS (including heaptrack overhead): 233.69MB total memory leaked: 67.58KB ``` While that test may be an edge worst-case scenario, https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions agrees that this also results in improvements in the usual situations.	2021-05-31 15:34:03 +03:00
Andy Wingo	bc1ad6e3c4	Revert "[WebAssembly][CodeGen] IR support for WebAssembly local variables" This reverts commit `bf35f4af51`. There was an error in a shared-library build.	2021-05-31 10:55:15 +02:00
Andy Wingo	bf35f4af51	[WebAssembly][CodeGen] IR support for WebAssembly local variables This patch adds TargetStackID::WasmLocal. This stack holds locations of values that are only addressable by name -- not via a pointer to memory. For the WebAssembly target, these objects are lowered to WebAssembly local variables, which are managed by the WebAssembly run-time and are not addressable by linear memory. For the WebAssembly target IR indicates that an AllocaInst should be put on TargetStackID::WasmLocal by putting it in the non-integral address space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift these allocations to SSA locals, but any alloca that reaches instruction selection (usually in non-optimized builds) will be assigned the new TargetStackID there. Loads and stores to those values are transformed to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes, which then lower to the type-specific LOCAL_GET_I32 etc instructions via tablegen patterns. Differential Revision: https://reviews.llvm.org/D101140	2021-05-31 10:40:38 +02:00
Sanjay Patel	7bb8bfa062	[InstCombine] fix miscompile from vector select substitution This is similar to the fix in `c590a9880d` ( PR49832 ), but we missed handling the pattern for select of bools (no compare inst). We can't substitute a vector value because the equality condition replacement that we are attempting requires that the condition is true/false for the entire value. Vector select can be partly true/false. I added an assert for vector types, so we shouldn't hit this again. Fixed formatting while auditing the callers. https://llvm.org/PR50500	2021-05-30 07:11:58 -04:00
Arthur Eubanks	3a6f12f915	Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering" This reverts commit `bc7d15c61d`. Dependent change is to be reverted.	2021-05-29 22:40:33 -07:00
Fangrui Song	38dbdde792	[Internalize] Simplify comdat renaming with noduplicates after D103043 I realized that we can use `comdat noduplicates` which is available on ELF. Add a special case for wasm which doesn't support the feature.	2021-05-28 16:58:38 -07:00
Eli Friedman	0b3b0a727a	[AArch64][RISCV] Make sure isel correctly honors failure orderings. If a cmpxchg specifies acquire or seq_cst on failure, make sure we generate code consistent with that ordering even if the success ordering is not acquire/seq_cst. At one point, it was ambiguous whether this sort of construct was valid, but the C++ standad and LLVM now accept arbitrary combinations of success/failure orderings. This doesn't address the corresponding issue in AtomicExpand. (This was reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .) Fixes https://bugs.llvm.org/show_bug.cgi?id=50512. Differential Revision: https://reviews.llvm.org/D103284	2021-05-28 12:47:40 -07:00
Craig Topper	2830d924b0	[VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names. Parameter positions seem like they should be unsigned. While there, make function names lowercase per coding standards. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103224	2021-05-28 11:28:47 -07:00
eopXD	fa488ea864	[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass This patch changes LoopFlattenPass from FunctionPass to LoopNestPass. Utilize LoopNest and let function 'Flatten' generate information from it. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D102904	2021-05-28 15:43:12 +00:00
Andy Wingo	ca5f07f8c4	Revert "[WebAssembly][CodeGen] IR support for WebAssembly local variables" This reverts commit `00ecf18979`, as it broke the AMDGPU build. Will reland later with a fix.	2021-05-28 12:42:12 +02:00
Andy Wingo	00ecf18979	[WebAssembly][CodeGen] IR support for WebAssembly local variables This patch adds TargetStackID::WasmLocal. This stack holds locations of values that are only addressable by name -- not via a pointer to memory. For the WebAssembly target, these objects are lowered to WebAssembly local variables, which are managed by the WebAssembly run-time and are not addressable by linear memory. For the WebAssembly target IR indicates that an AllocaInst should be put on TargetStackID::WasmLocal by putting it in the non-integral address space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift these allocations to SSA locals, but any alloca that reaches instruction selection (usually in non-optimized builds) will be assigned the new TargetStackID there. Loads and stores to those values are transformed to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes, which then lower to the type-specific LOCAL_GET_I32 etc instructions via tablegen patterns. Differential Revision: https://reviews.llvm.org/D101140	2021-05-28 11:07:41 +02:00
eopXD	e96d6f4821	Revert "[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass" This reverts commit `7952ddb21f`. Differential Revision: https://reviews.llvm.org/D103302	2021-05-28 07:58:06 +00:00
eopXD	7e06cf8f1b	Revert "[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass" This reverts commit `ffc4d3e068`.	2021-05-28 07:48:04 +00:00
eopXD	ffc4d3e068	[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass This patch changes LoopFlattenPass from FunctionPass to LoopNestPass. Utilize LoopNest and let function 'Flatten' generate information from it. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D102904	2021-05-28 07:25:53 +00:00
eopXD	7952ddb21f	[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass This patch changes LoopFlattenPass from FunctionPass to LoopNestPass. Utilize LoopNest and let function 'Flatten' generate information from it. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D102904	2021-05-28 07:11:26 +00:00
Jordan Rupprecht	e41aaea262	[NFC][libObject] clang-format Archive{.h,.cpp} In preparation for D100651	2021-05-27 16:48:40 -07:00
Andrea Di Biagio	57646d38d5	[MCA] Minor changes to the InOrderIssueStage. NFC The constructor of InOrderIssueStage no longer takes as input a reference to the target scheduling model. The stage can always query the subtarget to obtain a reference to the scheduling model. The ResourceManager is no longer stored internally as a unique_ptr. Moved a couple of method definitions to the .cpp file.	2021-05-28 00:33:59 +01:00
Andrea Di Biagio	50770d8de5	[MCA] Refactor the InOrderIssueStage stage. NFCI Moved the logic that checks for RAW hazards from the InOrderIssueStage to the RegisterFile. Changed how the InOrderIssueStage keeps track of backend stalls. Stall events are now generated from method notifyStallEvent(). No functional change intended.	2021-05-27 22:28:04 +01:00
Quinn Pham	62b5df7fe2	[PowerPC] Added multiple PowerPC builtins This is the first in a series of patches to provide builtins for compatibility with the XL compiler. Most of the builtins already had intrinsics and only needed to be implemented in the front end. Intrinsics were created for the three iospace builtins, eieio, and icbt. Pseudo instructions were created for eieio and iospace_eieio to ensure that nops were inserted before the eieio instruction. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D102443	2021-05-27 16:23:03 -05:00
Adrian Prantl	f3869a5c32	Support stripping indirectly referenced DILocations from !llvm.loop metadata in stripDebugInfo(). This patch fixes an oversight in https://reviews.llvm.org/D96181 and also takes into account loop metadata pointing to other MDNodes that point into the debug info. rdar://78487175 Differential Revision: https://reviews.llvm.org/D103220	2021-05-27 13:23:33 -07:00
Saleem Abdulrasool	4cc5a97101	MC: mark `dump` with `LLVM_DUMP_METHOD` Mark the `ELFRelocationEntry::dump` method as `LLVM_DUMP_METHOD` to annotate it properly as used to prevent the function being dead stripped away. This allows use of `dump` in the debugger. This is purely to improve the developer experience.	2021-05-27 10:47:39 -07:00
maekawatoshiki	2165360003	[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass. The next patch will utilize LoopNest to effectively handle loop nests. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D99149	2021-05-28 01:17:23 +09:00
Fraser Cormack	5a80dc4988	[VP][SelectionDAG] Add a target-configurable EVL operand type This patch adds a way for the target to configure the type it uses for the explicit vector length operands of VP SDNodes. The type must be a legal integer type (there is still no target-independent legalization of this operand) and must currently be at least as big as i32, the type used by the IR intrinsics. An implicit zero-extension takes place on targets which choose a larger type. All VP nodes should be created with this type used for the EVL operand. This allows 64-bit RISC-V to avoid custom legalization of all VP nodes, keeping them in their target-independent form for that bit longer. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103027	2021-05-27 15:27:36 +01:00
Mats Petersson	9091ecdae0	[OpenMP]Add support for workshare loop modifier in lowering When lowering the dynamic, guided, auto and runtime types of scheduling, there is an optional monotonic or non-monotonic modifier. This patch adds support in the OMP IR Builder to pass this down to the runtime functions. Also implements tests for the variants. Differential Revision: https://reviews.llvm.org/D102008	2021-05-27 15:33:05 +01:00
Simon Giesecke	5f2d4b23b4	Add --quiet option to llvm-gsymutil to suppress output of warnings. Differential Revision: https://reviews.llvm.org/D102829	2021-05-27 12:36:34 +00:00
Mats Petersson	86627be233	Revert "[OpenMP]Add support for workshare loop modifier in lowering" This reverts commit `ea4c5fb04c`.	2021-05-27 13:09:47 +01:00
Mats Petersson	ea4c5fb04c	[OpenMP]Add support for workshare loop modifier in lowering When lowering the dynamic, guided, auto and runtime types of scheduling, there is an optional monotonic or non-monotonic modifier. This patch adds support in the OMP IR Builder to pass this down to the runtime functions. Also implements tests for the variants. Differential Revision: https://reviews.llvm.org/D102008	2021-05-27 12:28:27 +01:00
Amara Emerson	9f39ba13b5	[GlobalISel] Implement splitting of G_SHUFFLE_VECTOR. Thhis is a port from the DAG legalization. We're still missing some of the canonicalizations of shuffles but it's a start. Differential Revision: https://reviews.llvm.org/D102828	2021-05-27 00:28:38 -07:00
Hasyimi Bahrudin	8d25762720	Fix non-global-value-max-name-size not considered by LLParser `non-global-value-max-name-size` is used by `Value` to cap the length of local value name. However, this flag is not considered by `LLParser`, which leads to unexpected `use of undefined value error`. The fix is to move the responsibility of capping the length to `ValueSymbolTable`. The test is the one provided by [[ https://bugs.llvm.org/show_bug.cgi?id=45899 \| Mikael in the bug report ]]. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D102707	2021-05-27 04:20:03 +00:00
Yevgeny Rouban	4d26f41f76	[RS4GC] Introduce intrinsics to get base ptr and offset There can be a need for some optimizations to get (base, offset) for any GC pointer. The base can be calculated by generating needed instructions as it is done by the RewriteStatepointsForGC::findBasePointer() function. The offset can be calculated in the same way. Though to not expose the base calculation and to make the offset calculation as simple as ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal outside RS4GC, this patch introduces 2 intrinsics: @llvm.experimental.gc.get.pointer.base(%derived_ptr) @llvm.experimental.gc.get.pointer.offset(%derived_ptr) These intrinsics are inlined by RS4GC along with generation of statepoint sequences. With these new intrinsics the GC parseable lowering for atomic memcpy intrinsics (`6ec2c5e402`) could be implemented as a separate pass. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D100445	2021-05-27 09:14:14 +07:00
Jessica Paquette	324af79dbc	[GlobalISel] Don't emit lost debug location remarks when legalizing tail calls There were a bunch of lost debug location remarks that show up when legalizing tail calls on AArch64. This would happen because we drop the return in the block where we emit the tail call. So, we end up dropping the debug location, which makes the LostDebugLocObserver report a missing debug location. Although it's true that we lose these debug locations, this isn't a particularly useful remark. We expect to drop these debug locations when emitting tail calls. Suppressing remarks in this case is preferable, since the amount of noise could hide actual debug location related bugs. To do this, I just plumbed the LostDebugLocObserver through the relevant LegalizerHelper functions. This is the only case I can think of where we need the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it in the LegalizerHelper proper and mucking around with the constructors, I figured it'd be cleanest to take the simplest path for now. This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at -Os. Differential Revision: https://reviews.llvm.org/D103128	2021-05-26 17:16:11 -07:00
Jacob Hegna	1494fa6943	Update documentation for InlineModel features. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D103193	2021-05-26 12:52:28 -07:00
Jeremy Morse	8496fc2ec8	[DebugInstrRef][1/3] Track PHI values through register allocation This patch introduces "DBG_PHI" instructions, a marker of where a PHI instruction used to be, before PHI elimination. Under the instruction referencing model, we want to know where every value in the function is defined -- and a PHI, even if implicit, is such a place. Just like instruction numbers, we can use this to identify a value to be used as a variable value, but we don't need to know what instruction defines that value, for example: bb1: DBG_PHI $rax, 1 [... more insts ... ] bb2: DBG_INSTR_REF 1, 0, !1234, !DIExpression() This specifies that on entry to bb1, whatever value is in $rax is known as value number one -- and the later DBG_INSTR_REF marks the position where variable !1234 should take on value number one. PHI locations are stored in MachineFunction for the duration of the regalloc phase in the DebugPHIPositions map. The map is populated by PHIElimination, and then flushed back into the instruction stream by virtregrewriter. A small amount of maintenence is needed in LiveDebugVariables to account for registers being split, but only for individual positions, not for entire ranges of blocks. Differential Revision: https://reviews.llvm.org/D86812	2021-05-26 20:24:00 +01:00
Philip Reames	ff08c3468f	[SCEV] Compute trip multiple for multiple exit loops This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case. Differential Revision: https://reviews.llvm.org/D103189	2021-05-26 11:52:25 -07:00
Heejin Ahn	5dd86aadf0	[WebAssembly] Add TargetInstrInfo::getCalleeOperand DwarfDebug unconditionally assumes for all call instructions the 0th operand is the callee operand, which seems to be true for other targets, but not for WebAssembly. This adds `TargetInstrInfo::getCallOperand` method whose default implementation returns `getOperand(0)` and makes WebAssembly overrides it to use its own utility method to get the callee operand. This also fixes an existing bug in `WebAssembly::getCalleeOp`, which was uncovered by this CL. Reviewed By: dschuff, djtodoro Differential Revision: https://reviews.llvm.org/D102978	2021-05-26 11:43:59 -07:00
Philip Reames	9306bb638f	[SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context. I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns. There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable. If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired. Differential Revision: https://reviews.llvm.org/D103182	2021-05-26 11:18:25 -07:00
Fangrui Song	73a1179535	[llvm-mc] Add -M to replace -riscv-no-aliases and -riscv-arch-reg-names In objdump, many targets support `-M no-aliases`. Instead of having a `-*-no-aliases` for each target when LLVM adds the support, it makes more sense to introduce objdump style `-M`. -riscv-arch-reg-names is removed. -riscv-no-aliases has too many uses and thus is retained for now. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D103004	2021-05-26 10:43:32 -07:00
Philip Reames	921d3f7af0	[SCEV] Add a utility for converting from "exit count" to "trip count" (Mostly as a logical place to put a comment since this is a reoccuring confusion.)	2021-05-26 10:41:49 -07:00
Philip Reames	fb14577d0c	[SCEV] Extract out a helper for computing trip multiples	2021-05-26 10:15:03 -07:00
Philip Reames	9cc2181ec3	[unroll] Use value domain for symbolic execution based cost model The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost. By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928. Differential Revision: https://reviews.llvm.org/D102934	2021-05-26 08:41:25 -07:00
Jonas Paulsson	d058262b14	[SystemZ] Support i128 inline asm operands. Support virtual, physical and tied i128 register operands in inline assembly. i128 is on SystemZ not really supported and is not a legal type and generally such a value will be split into two i64 parts. There are however some instructions that require a pair of two GPR64 registers contained in the GR128 bit reg class, which is untyped. For inline assmebly operands, it proved to be very cumbersome to first follow the general behavior of splitting an i128 operand into two parts and then later rebuild the INLINEASM MI to have one GR128 register. Instead, some minor common code changes were made to SelectionDAGBUilder to only create one GR128 register part to begin with. In particular: - getNumRegisters() now has an optional parameter "RegisterVT" which is passed by AddInlineAsmOperands() and GetRegistersForValue(). - The bitcasting in GetRegistersForValue is not performed if RegVT is Untyped. - The RC for a tied use in AddInlineAsmOperands() is now computed either from the tied def (virtual register), or by getMinimalPhysRegClass() (physical register). - InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register class (DstRC) can also be computed for an illegal type. In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and joinRegisterPartsIntoValue() have been implemented to handle i128 operands. Differential Revision: https://reviews.llvm.org/D100788 Review: Ulrich Weigand	2021-05-26 10:08:32 -05:00
Kerry McLaughlin	9f76a85260	[LoopVectorize] Enable strict reductions when allowReordering() returns false When loop hints are passed via metadata, the allowReordering function in LoopVectorizationLegality will allow the order of floating point operations to be changed: bool allowReordering() const { // When enabling loop hints are provided we allow the vectorizer to change // the order of operations that is given by the scalar loop. This is not // enabled by default because can be unsafe or inefficient. The -enable-strict-reductions flag introduced in D98435 will currently only vectorize reductions in-loop if hints are used, since canVectorizeFPMath() will return false if reordering is not allowed. This patch changes canVectorizeFPMath() to query whether it is safe to vectorize the loop with ordered reductions if no hints are used. For testing purposes, an additional flag (-hints-allow-reordering) has been added to disable the reordering behaviour described above. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101836	2021-05-26 13:59:12 +01:00
Tomas Matheson	165321b3d2	[MC][ELF] Emit unique sections for different flags Global values imply flags such as readable, writable, executable for the sections that they will be placed in. Currently MC places all such entries into the same section, using the first set of flags seen. This can lead to situations in LTO where a writable global is placed in the same named section as a readable global from another file, and the section may not be marked writable. D72194 ensures that mergeable globals with explicit sections are placed in separate sections with compatible entry size, by emitting the `unique` assembly syntax where appropriate. This change extends that approach to include section flags, so that globals with different section flags are emitted in separate unique sections. Differential revision: https://reviews.llvm.org/D100944	2021-05-26 11:51:29 +01:00
Esme-Yi	bf809cd165	[NFC][object] Change the input parameter of the method isDebugSection. Summary: This is a NFC patch to change the input parameter of the method SectionRef::isDebugSection(), by replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine if a section is debug type in more ways than just by section name. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102601	2021-05-26 08:47:53 +00:00
Arthur Eubanks	ad90a6be21	[OpaquePtr] Create new bitcode encoding for atomicrmw Since the opaque pointer type won't contain the pointee type, we need to separately encode the value type for an atomicrmw. Emit this new code for atomicrmw. Handle this new code and the old one in the bitcode reader. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103123	2021-05-25 16:30:34 -07:00
Kevin Athey	52ac114771	LLVM Detailed IR tests for introduction of flag -fsanitize-address-detect-stack-use-after-return-mode. Rework all tests that interact with use after return to correctly handle the case where the mode has been explicitly set to Never or Always. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D102462	2021-05-25 16:17:39 -07:00
Fangrui Song	b426b45d10	[Internalize] Rename instead of removal if a to-be-internalized comdat has more than one member Beside the `comdat any` deduplication feature, instrumentations use comdat to establish dependencies among a group of sections, to prevent section based linker garbage collection from discarding some members without discarding all. LangRef acknowledges this usage with the following wording: > All global objects that specify this key will only end up in the final object file if the linker chooses that key over some other key. On ELF, for PGO instrumentation, a `__llvm_prf_cnts` section and its associated `__llvm_prf_data` section are placed in the same GRP_COMDAT group. A `__llvm_prf_data` is usually not referenced and expects the liveness of its associated `__llvm_prf_cnts` to retain it. The `setComdat(nullptr)` code (added by D10679) in InternalizePass can break the use case (a `__llvm_prf_data` may be dropped with its associated `__llvm_prf_cnts` retained). The main goal of this patch is to fix the dependency relationship. I think it makes sense for InternalizePass to internalize a comdat and thus suppress the deduplication feature, e.g. a relocatable link of a regular LTO can create an object file affected by InternalizePass. If a non-internal comdat in a.o is prevailed by an internal comdat in b.o, the a.o references to the comdat definitions will be non-resolvable (references cannot bind to STB_LOCAL definitions in b.o). On PE-COFF, for a non-external selection symbol, deduplication is naturally suppressed with link.exe and lld-link. However, this is fuzzy on ELF and I tend to believe the spec creator has not thought about this use case (see D102973). GNU ld and gold are still using the "signature is name based" interpretation. So even if D102973 for ld.lld is accepted, for portability, a better approach is to rename the comdat. A comdat with one single member is the common case, leaving the comdat can waste (sizeof(Elf64_Shdr)+4*2) bytes, so we optimize by deleting the comdat; otherwise we rename the comdat. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103043	2021-05-25 14:15:27 -07:00
Nikita Popov	6300c37a46	[SCEV] Cache operands used in BEInfo (NFC) When memoized values for a SCEV expressions are dropped, we also drop all BECounts that make use of the SCEV expression. This is done by iterating over all the ExitNotTaken counts and (recursively) checking whether they use the SCEV expression. If there are many exits, this will take a lot of time. This patch improves the situation by pre-computing a set of all used operands, so that we can determine whether a certain BEInfo needs to be invalidated using a simple set lookup. Will still need to loop over all BEInfos though. This makes for a mild improvement on non-degenerate cases: https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384, for n=128 I'm seeing run time drop from 1.6s to 1.1s. Differential Revision: https://reviews.llvm.org/D102796	2021-05-25 21:03:33 +02:00
Michael Liao	c9dd29925f	[SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics. - When memory intrinsics, such as memcpy, the attached scoped AA metadata is not passed down to the backend. As a result, the backend cannot schedule relevant memory operations around them following that hint. In this patch, SelectionDAG is enhanced to propagate that metadata (scoped AA only) when they are lowered into loads and stores. Differential Revision: https://reviews.llvm.org/D102215	2021-05-25 14:42:26 -04:00
Philip Reames	aabca2d1da	[SCEV] Cleanup doesIVOverflowOnX checks [NFC] Stylistic changes only. 1) Don't pass a parameter just to do an early exit. 2) Use a name which matches actual behavior.	2021-05-25 10:12:24 -07:00
Philip Reames	a47b2d4567	[SCEV] Remove unused parameter from computeBECount [NFC] All callers pass "false" for the Equality parameter. Kill the dead code, and update the function block comment.	2021-05-25 09:58:56 -07:00
Yonghong Song	6a2ea84600	BPF: Add more relocation kinds Currently, BPF only contains three relocations: R_BPF_NONE for no relocation R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation R_BPF_64_32 for call insn and normal 32-bit data relocation Also .BTF and .BTF.ext sections contain symbols in allocated program and data sections. These two sections reserved 32bit space to hold the offset relative to the symbol's section. When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld may attempt to resolve relocations for .BTF and .BTF.ext, which we want to prevent. So we used R_BPF_NONE for such relocations. This all works fine until when we try to do linking of multiple objects. . R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data is different, so lld target->relocate() needs more context to do a correct job. . The same for R_BPF_64_32. More context is needed for lld target->relocate() to differentiate call insn vs. normal 32-bit data relocation. . Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE, they will not be relocated properly when multiple .BTF/.BTF.ext sections are merged by lld. This patch intends to address this issue by adding additional relocation kinds: R_BPF_64_ABS64 for normal 64-bit data relocation R_BPF_64_ABS32 for normal 32-bit data relocation R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations. The old R_BPF_64_{64,32} semantics: R_BPF_64_64 for LD_imm64 relocation R_BPF_64_32 for call insn relocation The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values is maintained. They are the most common use cases for bpf programs and we want to maintain backward compatibility as much as possible. ExecutionEngine RuntimeDyld BPF relocations are adjusted as well. R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and other relocations will be ignored. Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!" fatal error. FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp are removed as they are not triggered in BPF backend. BPF backend used FK_SecRel_8 for LD_imm64 instruction operands. Differential Revision: https://reviews.llvm.org/D102712	2021-05-25 08:19:13 -07:00
Anirudh Prasad	993f38d0a7	[SystemZ][z/OS] Implement getHostCPUName for z/OS - Currently, the host cpu information is not easily available on z/OS as in other platforms. - This information is stored in the Communications Vector Table (https://www.ibm.com/docs/en/zos/2.2.0?topic=information-cvt-mapping) Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D102793	2021-05-25 11:18:12 -04:00
Marco Elver	280333021e	[SanitizeCoverage] Add support for NoSanitizeCoverage function attribute We really ought to support no_sanitize("coverage") in line with other sanitizers. This came up again in discussions on the Linux-kernel mailing lists, because we currently do workarounds using objtool to remove coverage instrumentation. Since that support is only on x86, to continue support coverage instrumentation on other architectures, we must support selectively disabling coverage instrumentation via function attributes. Unfortunately, for SanitizeCoverage, it has not been implemented as a sanitizer via fsanitize= and associated options in Sanitizers.def, but rolls its own option fsanitize-coverage. This meant that we never got "automatic" no_sanitize attribute support. Implement no_sanitize attribute support by special-casing the string "coverage" in the NoSanitizeAttr implementation. To keep the feature as unintrusive to existing IR generation as possible, define a new negative function attribute NoSanitizeCoverage to propagate the information through to the instrumentation pass. Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035 Reviewed By: vitalybuka, morehouse Differential Revision: https://reviews.llvm.org/D102772	2021-05-25 12:57:14 +02:00
Stanislav Mekhanoshin	8f681d5b27	[IR] Allow Value::replaceUsesWithIf() to process constants The change is currently NFC, but exploited by the depending D102954. Code to handle constants is borrowed from the general implementation of Value::doRAUW(). Differential Revision: https://reviews.llvm.org/D103051	2021-05-25 02:12:01 -07:00
David Spickett	de7729d47a	[clang][ARM] Remove non-existent arm9312 CPU I cannot find documentation on this CPU, and it is not supported by the Arm Compiler 5 product either. It was likely a mistake or a different name for the "ep9312", which is an Arm based Cirrus Logic chip. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D103024	2021-05-25 08:58:24 +00:00
David Spickett	5f4d383a59	[clang][ARM] Remove non-existent arm1136jz-s CPU There is an ARM1136JF-S and an ARM1136J-S but I could find no references to an ARM1136JZ-S. In CPU manuals or the manual for Arm Compiler 5. See: https://developer.arm.com/documentation/ddi0211/latest/ https://developer.arm.com/documentation/dui0472/latest/ Using this CPU you get: $ ./bin/clang --target=arm-linux-gnueabihf -march=armv3m -mcpu=arm1136jz-s -c /tmp/test.c -o /tmp/test.o 'arm1136jz-s' is not a recognized processor for this target (ignoring processor) Since the llvm target does not know what it is. This is part of fixing https://bugs.llvm.org/show_bug.cgi?id=50454. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D103019	2021-05-25 08:54:59 +00:00
Lang Hames	82ad2b6e94	[JITLink] Enable creation and management of mutable block content. This patch introduces new operations on jitlink::Blocks: setMutableContent, getMutableContent and getAlreadyMutableContent. The setMutableContent method will set the block content data and size members and flag the content as mutable. The getMutableContent method will return a mutable copy of the existing content value, auto-allocating and populating a new mutable copy if the existing content is marked immutable. The getAlreadyMutableMethod asserts that the existing content is already mutable and returns it. setMutableContent should be used when updating the block with totally new content backed by mutable memory. It can be used to change the size of the block. The argument value should not be shared with any other block. getMutableContent should be used when clients want to modify the existing content and are unsure whether it is mutable yet. getAlreadyMutableContent should be used when clients want to modify the existing content and know from context that it must already be immutable. These operations reduce copy-modify-update boilerplate and unnecessary copies introduced when clients couldn't me sure whether the existing content was mutable or not.	2021-05-24 22:09:36 -07:00
Arthur Eubanks	a2ae14514a	Making Instrumentation aware of LoopNest Pass Intrumentation callbacks are not made aware of LoopNest passes. From the loop pass manager, we can pass the outermost loop of the LoopNest to instrumentation in case of LoopNest passes. The current patch made the change in two places in StandardInstrumentation.cpp. I will submit a proper patch where the OuterMostLoop is passed from the LoopPassManager to the call backs. That way we will avoid making changes at multiple places in StandardInstrumentation.cpp. A testcase also will be submitted. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102463	2021-05-24 20:25:52 -07:00
maekawatoshiki	e77d24f70a	Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass" This reverts commit `d65c32fb41`.	2021-05-25 11:39:49 +09:00
David Blaikie	a08673d04a	Add a range-based wrapper for std::unique(begin, end, binary_predicate)	2021-05-24 17:26:46 -07:00
Jon Roelofs	095e91c973	[Remarks] Add analysis remarks for memset/memcpy/memmove lengths Re-landing now that the crasher this patch previously uncovered has been fixed in: https://reviews.llvm.org/D102935 Differential revision: https://reviews.llvm.org/D102452	2021-05-24 10:10:44 -07:00
David Green	543406a69b	[ARM] Allow findLoopPreheader to return headers with multiple loop successors The findLoopPreheader function will currently not find a preheader if it branches to multiple different loop headers. This patch adds an option to relax that, allowing ARMLowOverheadLoops to process more loops successfully. This helps with WhileLoopStart setup instructions that can branch/fallthrough to the low overhead loop and to branch to a separate loop from the same preheader (but I don't believe it is possible for both loops to be low overhead loops). Differential Revision: https://reviews.llvm.org/D102747	2021-05-24 12:22:15 +01:00
Johannes Doerfert	6caea8a7fa	[Attributor] Introduce a helper do deal with constant type mismatches If we simplify values we sometimes end up with type mismatches. If the value is a constant we can often cast it though to still allow propagation. The logic is now put into a helper and it replaces some ad hoc things we did before. This also introduces the AA namespace for abstract attribute related functions and types.	2021-05-23 23:00:40 -05:00
Johannes Doerfert	5cdc29f795	[Attributor][FIX] Ensure we replace undef if we see the first "real" value The state of AAPotentialValues tracks if undef is contained. It should fold undef into the first non-undef value. However we missed a case before. There was also a shadowing definition of two variables that caused trouble. The test exposes both problems.	2021-05-23 20:47:06 -05:00
Fady Ghanim	766ad7d0aa	[OpenMP][OMPIRBuilder]Adding support for `omp atomic` This patch adds support for generating `omp atomic` for all different atomic clauses	2021-05-23 17:44:09 -04:00
Philipp Krones	c2f819af73	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo This makes it possible for targets to define their own MCObjectFileInfo. This MCObjectFileInfo is then used to determine things like section alignment. This is a follow up to D101462 and prepares for the RISCV backend defining the text section alignment depending on the enabled extensions. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101921	2021-05-23 14:15:23 -07:00
Fangrui Song	fc82507c89	[AArch64][MC] Remove unneeded "in .xxx directive" from diagnostics The prevailing style does not add the message. The directive name is not useful because the next line replicates the error line which includes the directive.	2021-05-23 13:58:16 -07:00

... 3 4 5 6 7 ...

45424 Commits