llvm-project

Commit Graph

Author	SHA1	Message	Date
Chen Zheng	61484762e9	[Debug-Info] change Tag type to dwarf::Tag for createAndAddDIE; NFC Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102207	2021-05-13 21:15:06 -04:00
Amara Emerson	af6eb1c710	[AArch64][GlobalISel] Fix a crash during unsuccessful G_CTPOP <2 x s64> legalization. The legalization rule for scalar-same-as doesn't handle vectors. Until we implement custom legalization for this, at least fall back properly.	2021-05-13 17:28:11 -07:00
Arthur Eubanks	2155dc51d7	[IR] Introduce the opaque pointer type The opaque pointer type is essentially just a normal pointer type with a null pointee type. This also adds support for the opaque pointer type to the bitcode reader/writer, as well as to textual IR. To avoid confusion with existing pointer types, we disallow creating a pointer to an opaque pointer. Opaque pointer types should not be widely used at this point since many parts of LLVM still do not support them. The next steps are to add some very simple use cases of opaque pointers to make sure they work, then start pretending that all pointers are opaque pointers and see what breaks. https://lists.llvm.org/pipermail/llvm-dev/2021-May/150359.html Reviewed By: dblaikie, dexonsmith, pcc Differential Revision: https://reviews.llvm.org/D101704	2021-05-13 15:22:27 -07:00
Aakanksha Patil	464e4dc50f	[AMDGPU] Add gfx1034 target Differential Revision: https://reviews.llvm.org/D102306	2021-05-13 14:25:18 -04:00
cynecx	8ec9fd4839	Support unwinding from inline assembly I've taken the following steps to add unwinding support from inline assembly: 1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax: ``` invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"() to label %exit unwind label %uexit ``` 2.) Add Bitcode writing/reading support + LLVM-IR parsing. 3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled. 4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind. 5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled. 6.) Don't allow unwinding callbr. Reviewed By: Amanieu Differential Revision: https://reviews.llvm.org/D95745	2021-05-13 19:13:03 +01:00
Max Kazantsev	d8b37de8a4	[GC][NFC] Move GCStrategy from CodeGen to IR We want it to be available in analyzes so that we could use the CodeGen notion in middle-end passes (for example, to check if a GC may free some particular pointer). This is a preparatory patch that simply moves the files around. Note: if this causes some build issues, this patch must just be reverted. Differential Revision: https://reviews.llvm.org/D100557 Reviewed By: reames	2021-05-13 12:31:59 +07:00
Lang Hames	2f21a272af	[JITLink] Expose x86-64 pointer jump stub block construction. This can be useful for clients who want to define their own symbol for the stub, or re-use some existing symbol.	2021-05-12 22:28:14 -07:00
Lang Hames	4b0f5edd36	[JITLink] Add a transferDefinedSymbol operation. The transferDefinedSymbol operation updates a Symbol's target block, offset, and size. This can be convenient when you want to redefine the content of some symbol(s) pointing at a block, while retaining the original block in the graph.	2021-05-12 22:28:14 -07:00
Anton Afanasyev	ab2c499d3a	[SLP] Add insertelement instructions to vectorizable tree Add new type of tree node for `InsertElementInst` chain forming vector. These instructions could be either removed, or replaced by shuffles during vectorization and we can add this node to cost model, so naturally estimating their cost, getting rid of `CompensateCost` tricks and reducing further work for InstCombine. This fixes PR40522 and PR35732 in a natural way. Also this patch is the first step towards revectorization of partially vectorization (to fix PR42022 completely). After adding inserts to tree the next step is to add vector instructions there (for instance, to merge `store <2 x float>` and `store <2 x float>` to `store <4 x float>`). Fixes PR40522 and PR35732. Differential Revision: https://reviews.llvm.org/D98714	2021-05-13 07:41:45 +03:00
Chen Zheng	a0ca4c46ca	[Debug-Info] add -gstrict-dwarf support in backend Reviewed By: dblaikie, probinson Differential Revision: https://reviews.llvm.org/D100826	2021-05-12 23:00:52 -04:00
Sam Clegg	3041b16f73	[WebAssembly] Add TLS data segment flag: WASM_SEG_FLAG_TLS Previously the linker was relying solely on the name of the segment to imply TLS. Differential Revision: https://reviews.llvm.org/D102202	2021-05-12 13:31:02 -07:00
Craig Topper	44e0e91db0	[ValueTypes] Rename MVT::getVectorNumElements() to MVT::getVectorMinNumElements(). Fix some misuses of getVectorNumElements() getVectorNumElements() returns a value for scalable vectors without any warning so it is effectively getVectorMinNumElements(). By renaming it and making getVectorNumElements() forward to it, we can insert a check for scalable vectors into getVectorNumElements() similar to EVT. I didn't do that in this patch because there are still more fixes needed, but I was able to temporarily do it and passed the RISCV lit tests with these changes. The changes to isPow2VectorType and getPow2VectorType are copied from EVT. The change to TypeInfer::EnforceSameNumElts reduces the size of AArch64's isel table. We're now considering SameNumElts to require the scalable property to match which removes some unneeded type checks. This was motivated by the bug I fixed yesterday in `80b9510806` Reviewed By: frasercrmck, sdesmalen Differential Revision: https://reviews.llvm.org/D102262	2021-05-12 07:46:45 -07:00
Martin Storsjö	4b98199ce8	[Passes] Reenable the relative lookup table converter pass for ELF and COFF on aarch64 The bug (PR50227, affecting COFF) that caused the revert in `6f5670a4c3` has been fixed in `382c505d9c` now, so it should be safe to reenable the pass for that target (and ELF). In PR50227 it's also mentioned that the same pass seems to cause problems on aarch64 on darwin, so leaving it disabled there for now.	2021-05-12 16:42:11 +03:00
Stephen Tozer	fdb055f4f1	Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" Previous crashes caused by this patch were the result of machine subregisters being incorrectly handled in updateDbgUsersToReg; this has been fixed by using RegUnits to determine overlapping registers, instead of using the register values directly. Differential Revision: https://reviews.llvm.org/D101523 This reverts commit `7ca26c5fa2`.	2021-05-12 10:19:57 +01:00
Matt Arsenault	6ecbdb761f	GlobalISel: Make constant fields const	2021-05-11 20:10:55 -04:00
Matt Arsenault	24e2e5df0e	GlobalISel: Split ValueHandler into assignment and emission classes Currently the ValueHandler handles both selecting the type and location for arguments, as well as inserting instructions needed to handle them. Split this so that the determination of the argument handling is independent of the function state. Currently the checks for tail call compatibility do not follow the full assignment logic, so it misses cases where arguments require nontrivial legalization. This should help avoid targets ending up in a buggy state where the argument evaluation may change in different contexts.	2021-05-11 19:50:12 -04:00
Matt Arsenault	2bdfcf0cac	GlobalISel: Move AArch64 AssignFnVarArg to base class We can handle the distinction easily enough in the generic code, and this makes it easier to abstract the selection of type/location from the code to insert code.	2021-05-11 19:50:12 -04:00
Jordan Rupprecht	fec2945998	Revert "[GVN] Clobber partially aliased loads." This reverts commit `6c57044231`. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543: ``` $ cat repro.ll ; ModuleID = 'repro.ll' source_filename = "repro.ll" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.widget = type { i32 } %struct.baz = type { i32, %struct.snork } %struct.snork = type { %struct.spam } %struct.spam = type { i32, i32 } @global = external local_unnamed_addr global %struct.widget, align 4 @global.1 = external local_unnamed_addr global i8, align 1 @global.2 = external local_unnamed_addr global i32, align 4 define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 { bb: %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1 %tmp1 = bitcast %struct.snork* %tmp to i64* %tmp2 = load i64, i64* %tmp1, align 4 %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1 %tmp4 = icmp ugt i64 %tmp2, 4294967295 br label %bb5 bb5: ; preds = %bb14, %bb %tmp6 = load i32, i32* %tmp3, align 4 %tmp7 = icmp ne i32 %tmp6, 0 %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false %tmp9 = zext i1 %tmp8 to i8 store i8 %tmp9, i8* @global.1, align 1 %tmp10 = load i32, i32* @global.2, align 4 switch i32 %tmp10, label %bb11 [ i32 1, label %bb12 i32 2, label %bb12 ] bb11: ; preds = %bb5 br label %bb14 bb12: ; preds = %bb5, %bb5 %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4 br label %bb14 bb14: ; preds = %bb12, %bb11 br label %bb5 } $ opt -O2 repro.ll -disable-output opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst , unsigned int, llvm::Type , llvm::Instruction , const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output ... ```	2021-05-11 16:08:53 -07:00
Petr Hosek	8280ece0c9	[Coverage] Support overriding compilation directory When making compilation relocatable, for example in distributed compilation scenarios, we want to set compilation dir to a relative value like `.` but this presents a problem when generating reports because if the file path is relative as well, for example `..`, you may end up writing files outside of the output directory. This change introduces a flag that allows overriding the compilation directory that's stored inside the profile with a different value that is absolute. Differential Revision: https://reviews.llvm.org/D100232	2021-05-11 15:26:45 -07:00
Lang Hames	a0162a81b1	[JITLink][MachO/x86_64] Expose API for creating eh-frame fixing passes. These can be used to create eh-frame section fixing passes outside the usual linker pipeline, which can be useful for tests and tools that just want to verify or dump graphs.	2021-05-11 15:26:16 -07:00
Lang Hames	74a96b4c98	[JITLink][x86-64] Add an x86_64 PointerSize constexpr. This can be used in place of magic '8' values in generic x86-64 utilities.	2021-05-11 15:26:15 -07:00
Alex Orlov	ebdcebfcb4	Removed unnecessary introduction of semi-colons.	2021-05-12 00:46:00 +04:00
Amara Emerson	ae2b36e8bd	[AArch64][GlobalISel] Support truncstorei8/i16 w/ combine to form truncating G_STOREs. This needs some tablegen changes so that we can actually import the patterns properly. Differential Revision: https://reviews.llvm.org/D102204	2021-05-11 11:33:03 -07:00
Alex Orlov	05d1ae4e18	* Add support for JSON output style to llvm-symbolizer This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D96883	2021-05-11 13:10:54 +04:00
Simon Pilgrim	33399405f4	Fix -Wdocumentation warnings. NFCI.	2021-05-11 09:51:49 +01:00
Sam Clegg	3b8d2be527	Reland: "[lld][WebAssembly] Initial support merging string data" This change was originally landed in: `5000a1b4b9` It was reverted in: `061e071d8c` This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world. Unlike the ELF linker this support is currently limited: - No support for SHF_MERGE (non-string merging) - Always do full tail merging ("lo" can be merged with "hello") - Only support single byte strings (p2align 0) Like the ELF linker merging is only performed at `-O1` and above. This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it doesn't not currently support debug sections because they are not represented by data segments (they are custom sections) Differential Revision: https://reviews.llvm.org/D97657	2021-05-10 16:03:38 -07:00
Nico Weber	061e071d8c	Revert "[lld][WebAssembly] Initial support merging string data" This reverts commit `5000a1b4b9`. Breaks tests, see https://reviews.llvm.org/D97657#2749151 Easily repros locally with `ninja check-llvm-mc-webassembly`.	2021-05-10 18:28:28 -04:00
Florian Hahn	93a9a8a8d9	[VecLib] Add support for vector fns from Darwin's libsystem. This patch adds support for Darwin's libsystem math vector functions to TLI. Darwin's libsystem provides a range of vector functions for libm functions. This initial patch only adds the 2 x double and 4 x float versions, which are available on both X86 and ARM64. On X86, wider vector versions are supported as well. Reviewed By: jroelofs Differential Revision: https://reviews.llvm.org/D101856	2021-05-10 21:19:58 +01:00
Sam Clegg	5000a1b4b9	[lld][WebAssembly] Initial support merging string data This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world. Unlike the ELF linker this support is currently limited: - No support for SHF_MERGE (non-string merging) - Always do full tail merging ("lo" can be merged with "hello") - Only support single byte strings (p2align 0) Like the ELF linker merging is only performed at `-O1` and above. This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it doesn't not currently support debug sections because they are not represented by data segments (they are custom sections) Differential Revision: https://reviews.llvm.org/D97657	2021-05-10 13:15:12 -07:00
Arthur Eubanks	85af8a8c1b	[NFC] Use ArgListEntry indirect types more in ISel lowering For opaque pointers, we're trying to avoid uses of PointerType::getElementType(). A couple of ISel places use PointerType::getElementType(). Some of these are easy to fix by using ArgListEntry's indirect types. The inalloca type wasn't stored there, as opposed to preallocated and byval which have their indirect types available, so add it and use it. Differential Revision: https://reviews.llvm.org/D101713	2021-05-10 13:05:15 -07:00
Lang Hames	9507bace6c	[ORC] Use a unique_function rather than std::function for dispatchTask.	2021-05-10 13:04:33 -07:00
Sanjay Patel	88d8f10baf	[PassManager] add helper function to hold set of vector passes (2nd try) This is better no-functional-change-intended than the 1st attempt. As noted in D102002, there were at least 2 diffs that went unchecked in pass manager regressions tests: different pass parameters (SimplifyCFG) and an extension point/callback. Those should be lifted from the original code blocks correctly now.	2021-05-10 14:43:00 -04:00
Tomasz Miąsko	2961f86317	[Demangle][Rust] Parse basic types Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102142	2021-05-10 09:44:46 -07:00
Sanjay Patel	822be4bec8	Revert "[PassManager] add helper function to hold set of vector passes" This reverts commit `fefcb1f878`. It was supposed to be NFC, but as noted in the post-commit comments in D102002, that was not true: SimplifyCFG uses different parameters and there's a difference in an extension point / callback.	2021-05-10 10:59:30 -04:00
Fraser Cormack	3212a08a8c	[Constant] Allow ConstantAggregateZero a scalable element count A ConstantAggregateZero may be created from a scalable vector type. However, it still assumed fixed number of elements when queried for them. This patch changes ConstantAggregateZero to correctly report its element count. This change fixes a couple of issues. Firstly, it fixes a crash in Constant::getUniqueValue when called on a scalable-vector zeroinitializer constant. Secondly, it fixes a latent bug in GlobalISel's IRTranslator in which translating a scalable-vector zeroinitializer would hit the assertion in ConstantAggregateZero::getNumElements when casting to a FixedVectorType, rather than reporting an error more gracefully. This is currently hypothetical as the IRTranslator has deeper issues preventing the use of scalable vector types. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102082	2021-05-10 13:51:53 +01:00
Mats Petersson	7280f4b279	[OpenMP][MLIR]Add support for guided, auto and runtime scheduling When using parallel loop construct, the OpenMP specification allows for guided, auto and runtime as scheduling variants (as well as static and dynamic which are already supported). This adds the translation from MLIR to LLVM-IR for these scheduling variants. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101435	2021-05-10 09:18:52 +00:00
Lang Hames	7f9a89f9a2	[ORC] Use the new dispatchTask API to run query callbacks. Dispatching query callbacks, rather than running them on the current thread, will allow them to be distributed across multiple threads.	2021-05-09 19:19:40 -07:00
Lang Hames	5344c88dcb	[ORC] Generalize materialization dispatch to task dispatch. Generalizing this API allows work to be distributed more evenly. In particular, query callbacks can now be dispatched (rather than running immediately on the thread that satisfied the query). This avoids the pathalogical case where an operation on one thread satisfies many queries simultaneously, causing large amounts of work to be run on that thread while other threads potentially sit idle.	2021-05-09 19:19:39 -07:00
Tomasz Miąsko	78e949159d	[Demangle][Rust] Print special namespaces Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101821	2021-05-09 15:45:57 -07:00
Andrea Di Biagio	9ceea66602	[MCA][RegisterFile] Refactor the move elimination logic to address PR50258. This patch lifts the restriction on the number of read/write registers for a move elimination candidate. With this patch, move elimination candidates with exactly two reads and two writes are treated like register swap operations for the purpose of move elimination. This patch currently doesn't affect any upstream model. However, it should help unblock the progress on PR50258.	2021-05-08 18:10:35 +01:00
Simon Pilgrim	ab5ee342b9	[GlobalISel] Ensure MachineIRBuilder::getDebugLoc() returns a const reference. NFCI. Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.	2021-05-08 16:23:28 +01:00
Xiang1 Zhang	d4bdeca576	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-05-08 14:21:11 +08:00
Arthur Eubanks	72bd0116e3	Fix build after `34a8a437b`	2021-05-07 23:18:44 -07:00
Xiang1 Zhang	bebafe01a7	Revert "[X86] Support AMX fast register allocation" This reverts commit `77e2e5e07d`.	2021-05-08 13:43:32 +08:00
Xiang1 Zhang	77e2e5e07d	[X86] Support AMX fast register allocation	2021-05-08 13:27:21 +08:00
Michael Liao	631da3b152	Replace a remaining CRLF with LF. NFC.	2021-05-08 01:09:15 -04:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
Amara Emerson	5b158093e2	[AArch64][GlobalISel] Create a new minimal combiner pass just for -O0. We never bothered to have a separate set of combines for -O0 in the prelegalizer before. This results in some minor performance hits for a mode where performance isn't a concern (although not regressing code size significantly is still preferable). This also removes the CSE option since we don't need it for -O0. Through experiments, I've arrived at a set of combines that gets the most code size improvement at -O0, while reducing the amount of time spent in the combiner by around 35% give or take. Differential Revision: https://reviews.llvm.org/D102038	2021-05-07 17:01:27 -07:00
Arthur Eubanks	6f7131002b	[NewPM] Move analysis invalidation/clearing logging to instrumentation We're trying to move DebugLogging into instrumentation, rather than being part of PassManagers/AnalysisManagers. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D102093	2021-05-07 15:25:31 -07:00
Arthur Eubanks	7ca26c5fa2	Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" This reverts commit `0791f968fe`. Causing crashes: https://crbug.com/1206764	2021-05-07 12:05:16 -07:00
Fangrui Song	d8aba75a76	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
Whitney Tsang	1006ac3963	[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect This patch allow more conditional branches to be considered as loop guard, and so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717	2021-05-07 16:04:18 +00:00
Benjamin Kramer	6248d11190	Retire TargetRegisterInfo::getSpillAlignment getSpillAlign does the same thing.	2021-05-07 15:16:22 +02:00
Simon Pilgrim	280aa3415e	[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts. Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication. I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to). NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly. Differential Revision: https://reviews.llvm.org/D101987	2021-05-07 13:12:30 +01:00
Stephen Tozer	0791f968fe	[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST This patch modifies updateDbgUsersToReg to properly handle DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices (i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and updating the register for all matching operands. Differential Revision: https://reviews.llvm.org/D101523	2021-05-07 11:47:50 +01:00
Guillaume Chatelet	e805b7c2d6	[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen Differential Revision: https://reviews.llvm.org/D102058	2021-05-07 10:22:41 +00:00
Guillaume Chatelet	eb1b26ec1d	[llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions Differential Revision: https://reviews.llvm.org/D102056	2021-05-07 10:21:35 +00:00
Sebastian Neubauer	98e5ede604	[AMDGPU] Serialize MFInfo::ScavengeFI Serialize ScavengeFI from SIMachineFunctionInfo into yaml. ScavengeFI is not used outside of the PrologEpilogInserter, so this shouldn't change anything. Differential Revision: https://reviews.llvm.org/D101367	2021-05-07 11:15:25 +02:00
Amara Emerson	1ccebb18ef	[GlobalISel] Micro-optimize the conditional branch optimization. Convert a check into an assert and pass an MI instead of recomputing in the apply function.	2021-05-07 00:03:09 -07:00
Chen Zheng	a95473c563	[XCOFF] handle string constants generation for AIX This follows https://www.ibm.com/docs/en/aix/7.2?topic=constants-string Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D101280	2021-05-07 06:43:36 +00:00
qixingxue	e388b9399b	[IR] Fix typo in comment of Intrinsics.td (NFC)	2021-05-07 13:21:58 +08:00
Cyndy Ishida	c4ed142e69	[llvm][TextAPI] add mapping from OS string to Platform * add utility for matching target triple OS value strings to PlatformKind This was reviewed offline by ributzka, steven_wu	2021-05-06 16:25:56 -07:00
Stanislav Mekhanoshin	c714d03785	[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32 Differential Revision: https://reviews.llvm.org/D102022	2021-05-06 16:17:33 -07:00
Sanjay Patel	fefcb1f878	[PassManager] add helper function to hold set of vector passes This is no-functional-change-intended (NFC) and split off from D102002 (which proposes to eliminate the LTO-based differences).	2021-05-06 15:36:15 -04:00
Mircea Trofin	97ab068034	[NPM] Do not run function simplification pipeline unnecessarily The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1]. This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since. The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios. [1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions \| compile time tracker ]] run with the option enabled. Differential Revision: https://reviews.llvm.org/D98103	2021-05-06 12:24:33 -07:00
Kerry McLaughlin	8c9742bd23	[SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences Adds support for scalable vectorization of loops containing first-order recurrences, e.g: ``` for(int i = 0; i < n; i++) b[i] = a[i] + a[i - 1] ``` This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into account when inserting into and extracting from the last lane of a vector. CreateVectorSplice has been added to construct a vector for the recurrence, which returns a splice intrinsic for scalable types. For fixed-width the behaviour remains unchanged as CreateVectorSplice will return a shufflevector instead. The tests included here are the same as test/Transform/LoopVectorize/first-order-recurrence.ll Reviewed By: david-arm, fhahn Differential Revision: https://reviews.llvm.org/D101076	2021-05-06 11:35:39 +01:00
Guillaume Chatelet	089ec047be	[llvm][NFC] Remove CallingConvLower deprecated alignment functions Differential Revision: https://reviews.llvm.org/D101910	2021-05-06 07:46:19 +00:00
Guillaume Chatelet	1fa21bf9e9	[llvm][NFC] Remove SelectionDag alignment deprecated functions Differential Revision: https://reviews.llvm.org/D101909	2021-05-06 07:44:14 +00:00
Guillaume Chatelet	040f4a97cd	[llvm][NFC] Remove deprecated InterleaveGroup::getAlignment() function. Differential Revision: https://reviews.llvm.org/D101907	2021-05-06 07:40:18 +00:00
Guillaume Chatelet	a065efa302	[llvm][NFC] Remove deprecated DataLayout::getPreferredAlignment functions Differential Revision: https://reviews.llvm.org/D101906	2021-05-06 07:28:00 +00:00
Guillaume Chatelet	b4795544d4	[llvm][NFC] Remove deprecated Alignment::None() Differential Revision: https://reviews.llvm.org/D101905	2021-05-06 07:21:23 +00:00
Johannes Doerfert	df729e2b82	[OpenMP] Overhaul `declare target` handling This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well. This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, that meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well. I tried to clean up things while I was at it, e.g., we should not "parse declarations and defintions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already. Highlights: - new diagnosis for restrictions specificed in the standard, - delayed emission of globals not mentioned in an explicit list of a declare target, - omission of `nohost` globals on the host and `host` globals on the device, - no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead, - precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and - `omp allocate` declarations will now replace an earlier emitted global, if necessary. --- Notes: The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall. After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around. --- Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D101030	2021-05-06 02:10:41 -05:00
Lang Hames	7b73cd684a	[ORC] Introduce C API for adding object buffers directly to an object layer. This can be useful for clients constructing custom JIT stacks: If the C API for your custom stack exposes API to obtain a reference to an object layer (e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT functions can be used to add objects directly to that layer.	2021-05-05 19:02:13 -07:00
RamNalamothu	41f8b8e807	[MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling This change enables emitting CFI unwind information for debugging purpose for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None. Currently generating CFI unwind information is entangled with supporting the exceptions, even when AsmPrinter explicitly recognizes that the unwind tables are being generated as debug information. In fact, the unwind information is not generated even if we specify --force-dwarf-frame-section, unless exceptions are enabled. The LIT test llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior. Enable this option for AMDGPU to prepare for future patches which add complete CFI support. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D78778	2021-05-06 04:53:45 +05:30
Matt Arsenault	fa0b93b5a0	GlobalISel: Use DAG call lowering infrastructure in a more compatible way Unfortunately the current call lowering code is built on top of the legacy MVT/DAG based code. However, GlobalISel was not using it the same way. In short, the DAG passes legalized types to the assignment function, and GlobalISel was passing the original raw type if it was simple. I do believe the DAG lowering is conceptually broken since it requires picking a type up front before knowing how/where the value will be passed. This ends up being a problem for AArch64, which wants to pass i1/i8/i16 values as a different size if passed on the stack or in registers. The argument type decision is split across 3 different places which is hard to follow. SelectionDAG builder uses getRegisterTypeForCallingConv to pick a legal type, tablegen gives the illusion of controlling the type, and the target may have additional hacks in the C++ part of the call lowering. AArch64 hacks around this by not using the standard AnalyzeFormalArguments and special casing i1/i8/i16 by looking at the underlying type of the original IR argument. I believe people have generally assumed the calling convention code is processing the original types, and I've discovered a number of dead paths in several targets. x86 actually relies on the opposite behavior from AArch64, and relies on x86_32 and x86_64 sharing calling convention code where the 64-bit cases implicitly do not work on x86_32 due to using the pre-legalized types. AMDGPU targets without legal i16/f16 have always used a broken ABI that promotes to i32/f32. GlobalISel accidentally fixed this to be the ABI we should have, but this fixes it so we're using the worse ABI that is compatible with the DAG. Ideally we would fix the DAG to match the old GlobalISel behavior, but I don't wish to fight that battle. A new native GlobalISel call lowering framework should let the target process the incoming types directly. CCValAssigns select a "ValVT" and "LocVT" but the meanings of these aren't entirely clear. Different targets don't use them consistently, even within their own call lowering code. My current belief is the intent was "ValVT" is supposed to be the legalized value type to use in the end, and and LocVT was supposed to be the ABI passed type (which is also legalized). With the default CCState::Analyze functions always passing the same type for these arguments, these only differ when the TableGen part of the lowering decide to promote the type from one legal type to another. AArch64's i1/i8/i16 hack ends up inverting the meanings of these values, so I had to add an additional hack to let the target interpret how large the argument memory is. Since targets don't consistently interpret ValVT and LocVT, this doesn't produce quite equivalent code to the initial DAG lowerings. I've opted to consistently interpret LocVT as the in-memory size for stack passed values, and ValVT as the register type to assign from that memory. We therefore produce extending loads directly out of the IRTranslator, whereas the DAG would emit regular loads of smaller values. This will also produce loads/stores that are wider than the argument value if the allocated stack slot is larger (and there will be undef padding bytes). If we had the optimizations to reduce load/stores based on truncated values, this wouldn't produce a different end result. Since ValVT/LocVT are more consistently interpreted, we now will emit more G_BITCASTS as requested by the CCAssignFn. For example AArch64 was directly assigning types to some physical vector registers which according to the tablegen spec should have been casted to a vector with a different element type. This also moves the responsibility for inserting G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the generic code, which is closer to how SelectionDAGBuilder works. I had to xfail an x86 test since I don't see a quick way to fix it right now (I filed bug 50035 for this). It's broken independently of this change, and only triggers since now we end up with more ands which hit the improperly handled selection pattern. I also observed that FP arguments that need promotion (e.g. f16 passed as f32) are broken, and use regular G_TRUNC and G_ANYEXT. TLDR; the current call lowering infrastructure is bad and nobody has ever understood how it chooses types.	2021-05-05 17:35:02 -04:00
Philipp Krones	632ebc4ab4	[MC] Untangle MCContext and MCObjectFileInfo This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: You can't contruct a MOFI without a MCContext without constructing the MCContext with a dummy version of that MOFI first. This removes this dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext making it more available to the different targets. Namely: - TargetTriple - ObjectFileType - SubtargetInfo Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101462	2021-05-05 10:03:02 -07:00
Anirudh Prasad	ae2aef1361	[AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM - As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information) - This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set - This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings` Reviewed By: abhina.sreeskantharajan, Kai Differential Revision: https://reviews.llvm.org/D101660	2021-05-05 10:21:55 -04:00
Martin Storsjö	6f5670a4c3	Revert "[Passes] Enable the relative lookup table converter pass on aarch64" This reverts commit `57b259a852`. The relative lookup table converter pass seems to cause problems for chromium on Windows/ARM64, see https://crbug.com/1204788.	2021-05-05 15:23:14 +03:00
Fangrui Song	7cac6a9d7a	[MC] Add MCAsmParser::parseComma to improve diagnostics llvm-mc will error "expected comma" instead of "unexpected token".	2021-05-04 14:13:19 -07:00
Fraser Cormack	6523ff6d47	[ValueTypes] Add MVTs for v256i16 and v256f16 This patch adds the two MVTs to fix a legalizer crash when using vector shuffles of <256 x i16> and <128 x i16> on RISC-V. The legalizer can't promote the operand of `v256i32 = any_extend_vector_inreg v128i16`. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D101769	2021-05-04 18:06:13 +01:00
Wei Mi	82956de05f	[SampleFDO] Fix a bug when appending function symbol into the Callees set of Root node in ProfiledCallGraph. In ProfiledCallGraph::addProfiledFunction, to add a function symbol into the ProfiledCallGraph, currently an uninitialized ProfiledCallGraphNode node is created by ProfiledFunctions[Name] and inserted into Callees set of Root node before the node is initialized. The Callees set use ProfiledCallGraphNodeComparer as its comparator so the uninitialized ProfiledCallGraphNode may fail to be inserted into Callees set if it happens to contain a name in memory which has been inserted into the Callees set before. The problem will prevent some function symbols from being annotated with profiles and cause performance regression. The patch fixes the problem. Differential Revision: https://reviews.llvm.org/D101815	2021-05-04 10:05:59 -07:00
Sander de Smalen	9931ae645e	Reland "[LV] Calculate max feasible scalable VF." Relands https://reviews.llvm.org/D98509 This reverts commit `51d648c119`.	2021-05-04 15:44:41 +01:00
Jan Svoboda	00895831ab	[clang][cli][docs] Clarify marshalling infrastructure documentation	2021-05-04 15:16:32 +02:00
Jan Svoboda	d0e3a15e36	[clang][cli] NFC: Remove confusing `EmptyKPM` variable	2021-05-04 14:27:57 +02:00
Simon Moll	1db4dbba24	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert `02c5ba8679` Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULE=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
David Green	18883a3fec	[TTI] Replace ceil lambdas with divideCeil. NFCI As pointed out in D101726, this function already exists in MathExtras. It uses different types, but with the values used here I believe that should not make a functional difference.	2021-05-04 09:04:44 +01:00
Tomasz Miąsko	7310403e3c	[demangler] Initial support for the new Rust mangling scheme Add a demangling support for a small subset of a new Rust mangling scheme, with complete support planned as a follow up work. Intergate Rust demangling into llvm-cxxfilt and use llvm-cxxfilt for end-to-end testing. The new Rust mangling scheme uses "_R" as a prefix, which makes it easy to disambiguate it from other mangling schemes. The public API is modeled after __cxa_demangle / llvm::itaniumDemangle, since potential candidates for further integration use those. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101444	2021-05-03 16:44:30 -07:00
Arthur Eubanks	2df3426fd1	[NewPM] Invalidate AAManager after populating GlobalsAA GlobalsAA is only created at the beginning of the inliner pipeline. If an AAManager is cached from previous passes, it won't get rebuilt to include the newly created GlobalsAA. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D101379	2021-05-03 16:37:32 -07:00
Valentin Clement	63f8226f25	[OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions Add function to create the offload_maptypes and the offload_mapnames globals. These two functions are used in clang. They will be used in the Flang/MLIR lowering as well. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101503	2021-05-03 15:42:32 -04:00
Anirudh Prasad	ca02fab7e7	[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("") for Z PC-relative instructions. - This patch attempts to implement the location counter syntax () for the HLASM variant for PC-relative instructions. - In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " " which is parsed as "<pc-rel-insn 0>" - For combinations of absolute values and relocatable values, we don't expect the "" preceding the token. When you have a " * " what’s accepted is: ``` <space>.{.} -> <pc-rel-insn> 0 [+\|-][constant-value] -> <pc-rel-insn> [+\|-]constant-value ``` When you don’t have a " * " what’s accepted is: ``` brasl 1,func is allowed (MCSymbolRef type) brasl 1,func+4 is allowed (MCBinary type) brasl 1,4+func is allowed (MCBinary type) brasl 1,-4+func is allowed (MCBinary type) brasl 1,func-4 is allowed (MCBinary type) brasl 1,func is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,+func is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,+func+4 is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,+4+func is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,-4+8+func is not allowed ( cannot be used for non-MCConstantExprs) ``` Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D100987	2021-05-03 14:58:24 -04:00
Paul Robinson	1d299252dd	[DebuggerTuning] Move a comment to a more useful place. The comment about how to make use of debugger tuning within DwarfDebug really belongs inside the DwarfDebug declaration, where it will be easier to find.	2021-05-03 11:08:04 -07:00
Chris Lattner	5fa9d41634	[Support/Parallel] Add a special case for 0/1 items to llvm::parallel_for_each. This avoids the non-trivial overhead of creating a TaskGroup in these degenerate cases, but also exposes parallelism. It turns out that the default executor underlying TaskGroup prevents recursive parallelism - so an instance of a task group being alive will make nested ones become serial. This is a big issue in MLIR in some dialects, if they have a single instance of an outer op (e.g. a firrtl.circuit) that has many parallel ops within it (e.g. a firrtl.module). This patch side-steps the problem by avoiding creating the TaskGroup in the unneeded case. See this issue for more details: https://github.com/llvm/circt/issues/993 Note that this isn't a really great solution for the general case of nested parallelism. A redesign of the TaskGroup stuff would be better, but would be a much more invasive change. Differential Revision: https://reviews.llvm.org/D101699	2021-05-03 10:08:00 -07:00
Abhina Sreeskantharajan	1527a5e4b4	[SystemZ][z/OS] Add the functions needed for handling EBCDIC I/O This patch adds the basic functions needed for controlling auto conversion on z/OS. Auto conversion is enabled on untagged input file to ASCII by making the assumption that all untagged files are EBCDIC encoded. Output files are auto converted to EBCDIC IBM-1047. This change also enables conversion for stdin/stdout/stderr. For more information on how fcntl controls codepage https://www.ibm.com/docs/en/zos/2.4.0?topic=descriptions-fcntl-bpx1fct-bpx4fct-control-open-file-descriptors Reviewed By: anirudhp Differential Revision: https://reviews.llvm.org/D100483	2021-05-03 08:52:38 -04:00
David Green	d1bbe61d1c	[ARM] Memory operands for MVE gathers/scatters Similarly to D101096, this makes sure that MMO operands get propagated through from MVE gathers/scatters to the Machine Instructions. This allows extra scheduling freedom, not forcing the instructions to act as scheduling barriers. We create MMO's with an unknown size, specifying that they can load from anywhere in memory, similar to the masked_gather or X86 intrinsics. Differential Revision: https://reviews.llvm.org/D101219	2021-05-03 11:24:59 +01:00
Sergio Perez Gonzalez	761d5614a1	[Object] Fix e_machine description for EM_CR16 and add EM_MICROBLAZE Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101133	2021-05-02 19:25:39 -07:00
David Green	15b5d1a5bf	[ARM] Transfer memory operands for VLDn We create MMO's for the VLDn/VSTn intrinsics in ARMTargetLowering:: getTgtMemIntrinsic, but they do not currently make it ll the way through ISel. This changes that in the various places it needs changing, making sure that the MMO is propagate through to the final instruction. This can help in scheduling, not treating the VLD2/VST2 as a scheduling barrier. Differential Revision: https://reviews.llvm.org/D101096	2021-05-03 00:04:21 +01:00
Craig Topper	6430430958	[TableGen] Use sign rotated VBR for OPC_EmitInteger. This allows for a much more efficient encoding for small negative numbers by storing the sign bit first and negating the rest of the bits. This was already being used for OPC_CheckInteger. For every in tree target this affects, the table got smaller. R600GenDAGISel.inc saw the largest reduction of 7K. I did have to add a new opcode for StringIntegers used for register class ids and subregister indices since we don't have the integer value to encode. The enum name is emitted directly into the table. Previously assumed the enum would expand to a positive 7-bit number. We might be able to just shift that right by 1 and assume it is a positive 6 bit number, but that will need more investigation.	2021-05-02 12:40:44 -07:00
Juneyoung Lee	1977c53b2a	[InstCombine] Fold overflow bit of [u\|s]mul.with.overflow in a poison-safe way As discussed in D101191, this patch adds a poison-safe folding of overflow bit check: ``` %Op0 = icmp ne i4 %X, 0 %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = select i1 %Op0, i1 %Op1, i1 false => %Y.fr = freeze %Y %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = %Op1 ``` https://alive2.llvm.org/ce/z/zgPUGT https://alive2.llvm.org/ce/z/h2gZ_6 Note that there are cases where inserting freeze is not necessary: e.g. %Y is `noundef`. In this case, LLVM is already good because `%ret` is already successfully folded into `and`, triggering the pre-existing optimization in InstSimplify: https://godbolt.org/z/v6qena15K Differential Revision: https://reviews.llvm.org/D101423	2021-05-02 11:54:12 +09:00
Nikita Popov	cc58e8918b	[SCEV] Simplify backedge count clearing (NFC) This seems to be a leftover from when the BackedgeTakenInfo stored multiple exit counts with manual memory management. At some point this was switchted to a simple vector, and there should be no need to micro-manage the clearing anymore. We can simply drop the loop from the map and the the destructor do its job.	2021-05-01 17:50:01 +02:00
Nathan Chancellor	4397b7095d	Revert "Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This reverts commit `791930d740`, as per https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy. I observed breakage with the Linux kernel, as reported at https://reviews.llvm.org/D91722#2724321 Fixes exist at https://reviews.llvm.org/D101523 https://reviews.llvm.org/D101540 but they have not landed so to unbreak the tree for the weekend, revert this commit. Commit `b11e4c9907` ("Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"") only reverted one follow-up fix, not the original patch that broke the kernel. e	2021-04-30 20:23:21 -07:00
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit `43bc584dc0`. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Daniil Fukalov	3489c2d7b1	[TTI] NFC: Change getTypeLegalizationCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen, kparzysz Differential Revision: https://reviews.llvm.org/D101533	2021-04-30 22:51:51 +03:00
Guozhi Wei	b817ea7b17	[MachineFunction] Make comment for TracksLiveness more clearer As discussed in https://lists.llvm.org/pipermail/llvm-dev/2021-April/150225.html, the current comments for TracksLiveness property and isKill flag are confusing. This patch makes the comments more clearer. Differential Revision: https://reviews.llvm.org/D101500	2021-04-30 12:10:36 -07:00
Scott Linder	f3026d8b8d	[ADT] Add llvm::remove_cvref and llvm::remove_cvref_t Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100669	2021-04-30 18:22:38 +00:00
Scott Linder	c6f20d70a8	[ADT] Add STLForwardCompat.h and llvm::disjunction Move some types in STLExtras.h which are named and behave identically to STL types from future standards into a dedicated header. This keeps them organized (they are not "extras" in the same sense as most types in STLExtras.h are) and fixes circular dependencies in future patches. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100668	2021-04-30 17:28:47 +00:00
Paul C. Anagnostopoulos	985ab6e1fa	[TableGen] Fix two bugs in 'defm' when complex 'assert' is involved. This patch fixes two bugs that arise when a 'defm' inherits from a multiclass and also from a class with assertions. Differential Revision: https://reviews.llvm.org/D101626	2021-04-30 11:31:06 -04:00
Simon Moll	43bc584dc0	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Dominik Montada	97ed1b6036	[GISel] Teach TableGen to check predicates of immediate operands in patterns Reviewed By: dsanders Differential Revision: https://reviews.llvm.org/D91703	2021-04-30 10:18:45 +02:00
Matt Arsenault	1cf3d68f97	VirtRegMap: Add pass option to not clear virt regs In a future change it will be possible to run register allocation with a specific set of register classes, so some of the remaining virtual registers will still be meaningful.	2021-04-29 21:08:47 -04:00
Zequan Wu	cab48e2f0e	[CodeGen] don't emit addrsig symbol if it's used only by metadata Value only used by metadata can be removed from .addrsig table. This solves the undefined symbol error when enabling addrsig table on COFF LTO. Differential Revision: https://reviews.llvm.org/D101512	2021-04-29 15:39:30 -07:00
Amara Emerson	96ec6d91e4	[AArch64][GlobalISel] Simplify out of range rotate amount. Differential Revision: https://reviews.llvm.org/D101005	2021-04-29 14:05:58 -07:00
Alexey Bataev	12c51f2358	[COST] Improve shuffle kind detection if shuffle mask is provided. Added an extra analysis for better choosing of shuffle kind in getShuffleCost functions for better cost estimation if mask was provided. Differential Revision: https://reviews.llvm.org/D100865	2021-04-29 12:48:00 -07:00
Alexey Bataev	6e859f3cd4	Revert "[COST] Improve shuffle kind detection if shuffle mask is provided." This reverts commit `9239932221` to fix a compiler crash on mask checks.	2021-04-29 12:40:33 -07:00
Victor Huang	ae3377c553	[AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects - Add new variantKinds for the symbol's variable offset and region handle - Print the proper relocation specifier @gd in the asm streamer when emitting the TC Entry for the variable offset for the symbol - Fix the switch section failure between the TC Entry of variable offset and region handle - Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property Reviewed by: sfertile Differential Revision: https://reviews.llvm.org/D100956	2021-04-29 13:18:59 -05:00
Benjamin Kramer	df323ba445	Revert "[X86] Support AMX fast register allocation" This reverts commit `3b8ec86fd5`. Revert "[X86] Refine AMX fast register allocation" This reverts commit `c3f95e9197`. This pass breaks using LLVM in a multi-threaded environment by introducing global state.	2021-04-29 18:56:33 +02:00
Alexey Bataev	9239932221	[COST] Improve shuffle kind detection if shuffle mask is provided. Added an extra analysis for better choosing of shuffle kind in getShuffleCost functions for better cost estimation if mask was provided. Differential Revision: https://reviews.llvm.org/D100865	2021-04-29 09:42:56 -07:00
Sanjay Patel	f4b1272d3d	[ADT] fix typo in code block comment; NFC	2021-04-29 12:09:22 -04:00
Anirudh Prasad	ded0a70aeb	[AsmParser][SystemZ][z/OS] Reject "Dot" as current PC on z/OS - Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter) - However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch) - The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not. - It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`) Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100975	2021-04-29 11:58:54 -04:00
Sander de Smalen	51d648c119	Revert "[LV] Calculate max feasible scalable VF." Temporarily reverting this patch due to some unexpected issue found by one of the PPC buildbots. This reverts commit `584e9b6e4b`.	2021-04-29 16:04:37 +01:00
Chirag Khandelwal	fbd3548d1c	[LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder This patch adds section support in the OpenMP IRBuilder module, along with a test for the same. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D89671	2021-04-29 18:39:49 +05:30
Amara Emerson	d138e97c2a	[GlobalISel] Bump CallLoweringInfo::OrigArgs initial size to 32. NFC. We spend some time during sqlite3 compilation regrowing this vector, bump it up to avoid this. Gives around 1-2% improvement in codegen-only time for sqlite3 at -O0.	2021-04-29 01:01:29 -07:00
Evgeny Leviant	6a0283d0d2	[NewPM] Add an option to dump pass structure Patch adds -debug-pass-structure option to dump pass structure when new pass manager is used. Differential revision: https://reviews.llvm.org/D99599	2021-04-29 10:29:42 +03:00
Bardia Mahjour	ddb3b26a12	[LV] Consider Loop Unroll Hints When Making Interleave Decisions This patch causes the loop vectorizer to not interleave loops that have nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)). Note that if a particular interleave count is being requested (through llvm.loop.interleave_count), it will still be honoured, regardless of the presence of nounroll hints. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101374	2021-04-28 17:27:52 -04:00
Anirudh Prasad	07b0a72d8e	[AsmParser][SystemZ][z/OS] Use updated framework in AsmLexer to accept special tokens as Identifiers - Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers. - These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ. - In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_\|$\|@\|#] are already supported as "part" of an Identifier. - The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100959	2021-04-28 15:43:24 -04:00
Jonas Devlieghere	625bd94c6d	[dsymutil] Add flag to force a static variable to keep its enclosing function Add a flag to change dsymutil's behavior and force a static variable to keep its enclosing function. The test shows a situation where that could be useful. I'm not convinced this behavior makes sense as a default, which is why it's behind a flag. rdar://74918374 Differential revision: https://reviews.llvm.org/D101337	2021-04-28 11:33:04 -07:00
David Candler	b8baa2a913	[ARM][AArch64] Require appropriate features for crypto algorithms This patch changes the AArch32 crypto instructions (sha2 and aes) to require the specific sha2 or aes features. These features have already been implemented and can be controlled through the command line, but do not have the expected result (i.e. `+noaes` will not disable aes instructions). The crypto feature retains its existing meaning of both sha2 and aes. Several small changes are included due to the knock-on effect this has: - The AArch32 driver has been modified to ensure sha2/aes is correctly set based on arch/cpu/fpu selection and feature ordering. - Crypto extensions are permitted for AArch32 v8-R profile, but not enabled by default. - ACLE feature macros have been updated with the fine grained crypto algorithms. These are also used by AArch64. - Various tests updated due to the change in feature lists and macros. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D99079	2021-04-28 16:26:18 +01:00
Paul C. Anagnostopoulos	952c6ddd8b	[TableGen] Add the !find bang operator !find searches a source string for a target string and returns the position. Differential Revision: https://reviews.llvm.org/D101318	2021-04-28 09:51:00 -04:00
Matt Arsenault	cea97fc0fc	GlobalISel: Relax verification of physical register copy types This was picking a concrete size for a physical register, and enforcing exact match on the virtual register's type size. Some targets add multiple types to a register class, and some are smaller than the full bit width. For example x86 adds f32 to 128-bit xmm registers, and AMDGPU adds i16/f16 to 32-bit registers. It might be better to represent these cases as a copy of the full register and an extraction of the subpart, but a lot of code assumes you can directly copy. This will help fix the current usage of the DAG calling convention infrastructure which is incompatible with how GlobalISel is now using it. The API is somewhat cumbersome here, but I just mirrored the existing functions, except now with LLTs (and allow returning null on failure, unlike the MVT version). I think the concept of selecting register classes based on type is flawed to begin with, but I'm trying to keep this compatible with the existing handling.	2021-04-28 08:45:41 -04:00
Sander de Smalen	584e9b6e4b	[LV] Calculate max feasible scalable VF. This patch also refactors the way the feasible max VF is calculated, although this is NFC for fixed-width vectors. After this change scalable VF hints are no longer truncated/clamped to a shorter scalable VF, nor does it drop the 'scalable flag' from the suggested VF to vectorize with a similar VF that is fixed. Instead, the hint is ignored which means the vectorizer is free to find a more suitable VF, using the CostModel to determine the best possible VF. Reviewed By: c-rhodes, fhahn Differential Revision: https://reviews.llvm.org/D98509	2021-04-28 12:30:00 +01:00
Benjamin Kramer	7e5682ee62	[ADT] Make TrackingStatistic's ctor constexpr This lets clang diagnose unused statistics, so remove them.	2021-04-28 12:00:17 +02:00
RamNalamothu	63cfab4f40	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-28 09:04:04 +05:30
Craig Topper	3067520bf4	[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT Previously we used an i32 constant to store the saturation width, but i32 isn't legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the type legalizer. This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes it opaque to the type legalizer. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101262	2021-04-27 14:38:42 -07:00
Roman Lebedev	15f631cc78	[NFC][IR] PHINode: ... and assert in another ctor too	2021-04-27 20:52:44 +03:00
Roman Lebedev	1ebbf84ba4	[NFC][IR] PHINode: assert we aren't trying to create token-typed PHI Verifier will complain, but by then it may be too late, because we might have never reached it because we already crashed with some bogus bug. It is best to catch this the moment it happens.	2021-04-27 20:49:42 +03:00
Nick Desaulniers	ea8416bf4d	[CodeGenOptions] make StackProtectorGuardOffset signed GCC supports negative values for -mstack-protector-guard-offset=, this should be a signed value. Pre-req to D100919. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101325	2021-04-27 10:12:58 -07:00
Victor Huang	241c2da406	[AIX][Power10] Restrict prefixed instructions from crossing the 64byte boundary This patch adds the support to restrict prefixed instruction from crossing the 64 byte boundary: - Add the infrastructure to register a custom XCOFF streamer - Add a custom XCOFF streamer for PowerPC to allow us to intercept instructions as they are being emitted and align all 8 byte instructions to a 64 byte boundary if required by adding a 4 byte nop. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D101107	2021-04-27 11:55:18 -05:00
Nico Weber	21da04f701	[llvm, clang] Remove stdlib includes from .h files without `std::` Found files not containing `std::` with: INCL="algorithm\|array\|list\|map\|memory\|queue\|set\|string\|utility\|vector\|unordered_map\|unordered_set" git ls-files llvm/include/llvm \| grep '\.h$' \| xargs grep -L std:: \| \ xargs grep -El "#include <($INCL)>$" > to_process.txt git ls-files clang/include/clang \| grep '\.h$' \| xargs grep -L std:: \| \ xargs grep -El "#include <($INCL)>$" >> to_process.txt Then removed these headers from those files with INCL_ESCAPED="$(echo $INCL\|sed 's/\|/\\\|/g')" cat to_process.txt \| xargs sed -i "/^#include <$$INCL_ESCAPED$>$/d" cat to_process.txt \| xargs sed -i '/^$/N;/^\n$/D' No behavior change. Differential Revision: https://reviews.llvm.org/D101378	2021-04-27 12:41:39 -04:00
Petar Avramovic	4c6eb3886c	[MIPatternMatch]: Add matchers for binary instructions Add matchers that support commutative and non-commutative binary opcodes. Differential Revision: https://reviews.llvm.org/D99736	2021-04-27 11:37:42 +02:00
Petar Avramovic	39662abf72	[MIPatternMatch]: Add mi_match for MachineInstr This utility allows more efficient start of pattern match. Often MachineInstr(MI) is available and instead of using mi_match(MI.getOperand(0).getReg(), MRI, ...) followed by MRI.getVRegDef(Reg) that gives back MI we now use mi_match(MI, MRI, ...). Differential Revision: https://reviews.llvm.org/D99735	2021-04-27 11:08:16 +02:00
Petar Avramovic	ebe408ad80	[MIPatternMatch]: Add ICstRegMatch Matches G_CONSTANT and returns its def register. Differential Revision: https://reviews.llvm.org/D99734	2021-04-27 10:53:17 +02:00
Petar Avramovic	0713c82b13	[GlobalISel]: Add a getConstantIntVRegVal utility Returns ConstantInt from G_CONSTANT instruction given its def register. Differential Revision: https://reviews.llvm.org/D99733	2021-04-27 10:52:07 +02:00
dfukalov	e4c606acaf	[TTI] NFC: Change getScalarizationOverhead and getOperandsScalarizationOverhead to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101283	2021-04-27 08:51:48 +03:00
Lang Hames	a702fa2a04	[ORC] Make LLVMOrcLLJITBuilderSetJITTargetMachineBuilder consume as advertised. This should fix some of the memory leaks seen in the ORC C API test case.	2021-04-26 22:26:38 -07:00
Chen Zheng	e5000eef81	[XCOFF] make .file directive have directory info The .file directive is changed to only have basename in D36018 for ELF. But on AIX, we require the .file directive to also contain the directory info. This aligns with other AIX compiler like XLC and is required by some AIX tool like DBX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D99785	2021-04-27 00:15:23 -04:00
Lang Hames	d122d80b3d	Reapply "[ORC] Add unit tests for parts of the ..." with fixes and improvements. This reapplies `8740360093`, which was reverted in `bbddadd46e` due to buildbot errors. This version checks that a JIT instance can be safely constructed, skipping tests if it can not be. To enable this it introduces new C API to retrieve and set the target triple for a JITTargetMachineBuilder.	2021-04-26 20:44:40 -07:00
William S. Moses	7aa3cad46a	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 20:12:12 -04:00
Fangrui Song	18839be9c5	[ADT] Remove StatisticBase and make NoopStatistic empty In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 mostly unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text). ``` # This means the D69428 optimization on clang is mostly nullified by this patch. HEAD+D69428: size(.bss) = 0x0725a8 HEAD+D101211: size(.bss) = 0x072238 # bloaty - HEAD+D69428 vs HEAD+D101211 # With D101211, we also save a lot of string table space (.rodata). FILE SIZE VM SIZE -------------- -------------- -0.0% -32 -0.0% -24 .eh_frame -0.0% -336 [ = ] 0 .symtab -0.0% -360 [ = ] 0 .strtab [ = ] 0 -0.2% -880 .bss -0.0% -2.11Ki -0.0% -2.11Ki .rodata -0.0% -2.89Ki -0.0% -2.89Ki .text -0.0% -5.71Ki -0.0% -5.88Ki TOTAL ``` Note: LoopFuse is a disabled pass. For now this patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful in LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with `llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings. Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which calls `getName`/`getDesc`/`getValue`. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211	2021-04-26 16:47:32 -07:00
William S. Moses	8ede96493c	Revert "[NVPTX] Enable lowering of atomics on local memory" This reverts commit `fede99d386`.	2021-04-26 19:33:01 -04:00
William S. Moses	fede99d386	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 19:27:27 -04:00
Lei Zhang	254e289d45	Revert "[ADT] Remove StatisticBase and make NoopStatistic empty" This reverts commit `b540311781` because it breaks MLIR build: https://buildkite.com/mlir/mlir-core/builds/13299#ad0f8901-dfa4-43cf-81b8-7940e2c6c15b	2021-04-26 18:31:04 -04:00
Fangrui Song	e01c666b13	Revert D76519 "[NFC] Refactor how CFI section types are represented in AsmPrinter" This reverts commit `0ce723cb22`. D76519 was not quite NFC. If we see a CFISection::Debug function before a CFISection::EH one (-fexceptions -fno-asynchronous-unwind-tables), we may incorrectly pick CFISection::Debug and emit a `.cfi_sections .debug_frame`. We should use .eh_frame instead. This scenario is untested.	2021-04-26 15:17:28 -07:00
Lang Hames	c8fc5e3ba9	[ORC] C API updates. Adds support for creating custom MaterializationUnits in the C API with the new LLVMOrcCreateCustomMaterializationUnit function. Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any clients of the LLVMOrcAbsoluteSymbols API. Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to allow clients to get a reference to an LLJIT instance's linking layer, then emit an object file using it. This can be used to support construction of custom materialization units in the common case where those units will generate an object file that needs to be emitted to complete the materialization.	2021-04-26 13:58:37 -07:00
Lang Hames	8d718a0bff	[ORC] Fix type name. Rename JITTargetSymbolFlags to JITSymbolTargetFlags. This matches the convention used for JITSymbolGenericFlags.	2021-04-26 13:58:37 -07:00
Fangrui Song	b540311781	[ADT] Remove StatisticBase and make NoopStatistic empty In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text). ``` # This means the D69428 optimization on clang is mostly nullified by this patch. HEAD+D69428: size(.bss) = 0x0725a8 HEAD+D101211: size(.bss) = 0x072238 # bloaty - HEAD+D69428 vs HEAD+D101211 # With D101211, we also save a lot of string table space (.rodata). FILE SIZE VM SIZE -------------- -------------- -0.0% -32 -0.0% -24 .eh_frame -0.0% -336 [ = ] 0 .symtab -0.0% -360 [ = ] 0 .strtab [ = ] 0 -0.2% -880 .bss -0.0% -2.11Ki -0.0% -2.11Ki .rodata -0.0% -2.89Ki -0.0% -2.89Ki .text -0.0% -5.71Ki -0.0% -5.88Ki TOTAL ``` Note: LoopFuse is a disabled pass. This patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful and not noisy, we can replace `llvm::Statistic` with `llvm::TrackingStatistic` in the future. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211	2021-04-26 13:39:35 -07:00
William S. Moses	494e77138c	[Lexer] Allow LLLexer to be used as an API Explose LLVM Lexer for usage externally as an API Differential Revision: https://reviews.llvm.org/D100920	2021-04-26 12:43:14 -04:00
Paul C. Anagnostopoulos	2d4c4d3c54	[TableGen] Change assertion information from a tuple to a struct [NFC] Differential Revision: https://reviews.llvm.org/D100854	2021-04-26 09:57:16 -04:00
Tim Renouf	8710eff6c3	[MC][AMDGPU][llvm-objdump] Synthesized local labels in disassembly 1. Add an accessor function to MCSymbolizer to retrieve addresses referenced by a symbolizable operand, but not resolved to a symbol. That way, the caller can synthesize labels at those addresses and then retry disassembling the section. 2. Implement that in AMDGPU -- a failed symbol lookup results in the address being added to a vector returned by the new function. 3. Use that in llvm-objdump when using MCSymbolizer (which only happens on AMDGPU) and SymbolizeOperands is on. Differential Revision: https://reviews.llvm.org/D101145 Change-Id: I19087c3bbfece64bad5a56ee88bcc9110d83989e	2021-04-26 13:56:36 +01:00
David Sherwood	a458b7855e	[AArch64] Add AArch64TTIImpl::getMaskedMemoryOpCost function When vectorising for AArch64 targets if you specify the SVE attribute we automatically then treat masked loads and stores as legal. Also, since we have no cost model for masked memory ops we believe it's cheap to use the masked load/store intrinsics even for fixed width vectors. This can lead to poor code quality as the intrinsics will currently be scalarised in the backend. This patch adds a basic cost model that marks fixed-width masked memory ops as significantly more expensive than for scalable vectors. Tests for the cost model are added here: Transforms/LoopVectorize/AArch64/masked-op-cost.ll Differential Revision: https://reviews.llvm.org/D100745	2021-04-26 11:00:03 +01:00
Levy Hsu	8cf54c7ff5	[RISCV] [1/2] Add IR intrinsic for Zbe extension RV32/64: bcompress bdecompress RV64 ONLY: bcompressw bdecompressw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101143	2021-04-25 19:14:34 -07:00
Tomasz Miąsko	1cea7ab4ba	[demangler] Use standard semantics for StringView::substr The StringView::substr now accepts a substring starting position and its length instead of previous non-standard `from` & `to` positions. All uses of two argument StringView::substr are in MicrosoftDemangler and have 0 as a starting position, so no changes are necessary. This also fixes a bug where attempting to extract a suffix with substr (a `to` position equal to size) would return a substring without the last character. Fixing the issue should not introduce observable changes in the demangler, since as currently used, a second argument to StringView::substr is either: 1) a result of a successful call to StringView::find and so necessarily smaller than size., or 2) in the case of Demangler::demangleCharLiteral potentially equal to size, but with demangler expecting more data to follow later on and failing either way. Reviewed By: #libc_abi, ldionne, erik.pilkington Differential Revision: https://reviews.llvm.org/D100246	2021-04-25 13:56:41 +02:00
Xiang1 Zhang	3b8ec86fd5	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-04-25 09:45:41 +08:00
Lang Hames	c572ff840f	[ORC][C-bindings] Fix missing ')' in comments.	2021-04-24 18:04:57 -07:00
Nikita Popov	95af971764	[PatternMatch] Improve m_Deferred() documentation (NFC) m_Deferred() has nothing to do with commutative matchers, it needs to be used whenever the value to match is determinde as part of the same match expression.	2021-04-24 21:00:24 +02:00
RamNalamothu	0ce723cb22	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-24 23:29:42 +05:30
dfukalov	6c57044231	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-04-24 14:14:20 +03:00
Teresa Johnson	10b781fb03	Mark type test intrinsics as speculatable to fix inline cost There is already code in InlineCost.cpp to identify and ignore ephemeral values (llvm.assume intrinsics and other side-effect free instructions only feeding the assumes). However, because llvm.type.test intrinsics were not marked speculatable, they and any instructions specifically feeding the type test (typically a bitcast) were being counted towards the instruction cost when inlining. This was causing profile matching issues in some cases when enabling -fwhole-program-vtables for whole program devirtualization. According to the language reference, the speculatable attribute means: "the function does not have any effects besides calculating its result and does not have undefined behavior". I see no reason why type tests cannot be marked with this attribute. There are 2 test changes: llvm/test/Transforms/Inline/ephemeral.ll: I added a type test intrinsic here to verify the fix. Also, I found the test was not actually testing what it originally intended. Many of the existing instructions were optimized away by -Oz, and the cost of inlining was negative due to the benefit of removing the call. So I changed the test to simply invoke the inline pass and check the number of instructions computed by InlineCost. I also fixed an instruction that was not actually used anywhere. llvm/test/Transforms/SimplifyCFG/no-md-sink.ll needed to be made more robust to code changes that reordered the metadata. Differential Revision: https://reviews.llvm.org/D101180	2021-04-23 10:02:31 -07:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Sander de Smalen	f9a50f04ba	[TTI] NFC: Change getIntImmCost[Inst\|Intrin] to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100565	2021-04-23 16:06:36 +01:00
Sander de Smalen	43ace8b5ce	[TTI] NFC: Change getScalingFactorCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100564	2021-04-23 16:06:36 +01:00
Sander de Smalen	008a072ded	[TTI] NFC: Change getMemcpyCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100563	2021-04-23 16:06:35 +01:00
Sander de Smalen	9ba07f37f8	[TTI] NFC: Change getGEPCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100562	2021-04-23 16:06:35 +01:00
Sander de Smalen	e0edfa052f	[TTI] NFC: Change getAddressComputationCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100561	2021-04-23 16:06:35 +01:00
dfukalov	9ab17a60eb	[TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D101151	2021-04-23 18:02:00 +03:00
Daniil Fukalov	f79d055791	[TTI] Fix ScalarizationCost initialization. In cases when ScalarizationCostPassed has no value, UINT_MAX is actually used for cost estimation in `return ScalarCalls * ScalarCost + ScalarizationCost`. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101099	2021-04-23 17:59:59 +03:00
Stephen Tozer	791930d740	Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" Previous build failures were caused by an error in bitcode reading and writing for DIArgList metadata, which has been fixed in `e5d844b587`. There were also some unnecessary asserts that were being triggered on certain builds, which have been removed. This reverts commit `dad5caa59e`.	2021-04-23 10:54:01 +01:00
Jay Foad	63af3c000b	[GlobalISel] Remove ConstantFoldingMIRBuilder ConstantFoldingMIRBuilder was an experiment which is not used for anything. The constant folding functionality is now part of CSEMIRBuilder. Differential Revision: https://reviews.llvm.org/D101050	2021-04-23 09:13:27 +01:00
Fangrui Song	2786e673c7	[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all} The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it. Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan) Also document the function attribute "frame-pointer" which is long overdue. Differential Revision: https://reviews.llvm.org/D101016	2021-04-22 18:07:30 -07:00
Levy Hsu	b49337bbb9	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Craig Topper	e01c419ecd	[RISCV] Add IR intrinsics for vmsge(u).vv/vx/vi. These instructions don't really exist, but we have ways we can emulate them. .vv will swap operands and use vmsle().vv. .vi will adjust the immediate and use .vmsgt(u).vi when possible. For .vx we need to use some of the multiple instruction sequences from the V extension spec. For unmasked vmsge(u).vx we use: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd For cases where mask and maskedoff are the same value then we have vmsge{u}.vx v0, va, x, v0.t which is the vd==v0 case that requires a temporary so we use: vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt For other masked cases we use this sequence: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0 We trust that register allocation will prevent vd in vmslt{u}.vx from being v0 since v0 is still needed by the vmxor. Differential Revision: https://reviews.llvm.org/D100925	2021-04-22 10:44:38 -07:00
Fangrui Song	ef5e7f90ea	Temporarily revert the code part of D100981 "Delete le32/le64 targets" This partially reverts commit `77ac823fd2`. Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934). Temporarily brings back the code part to give them some time for migration.	2021-04-22 10:18:44 -07:00
Krzysztof Parzyszek	deda60fcaf	[Hexagon] Add HVX intrinsics for conditional vector loads/stores Intrinsics for the following instructions are added. The intrinsic name is "int_hexagon_<inst>[_128B]", e.g. int_hexagon_V6_vL32b_pred_ai for 64-byte version int_hexagon_V6_vL32b_pred_ai_128B for 128-byte version V6_vL32b_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_nt_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vL32b_nt_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vS32b_pred_ai if (Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_pred_pi if (Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_pred_ppu if (Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32b_npred_ai if (!Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_npred_pi if (!Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_npred_ppu if (!Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32Ub_pred_ai if (Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_pred_pi if (Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_pred_ppu if (Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32Ub_npred_ai if (!Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_npred_pi if (!Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_npred_ppu if (!Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32b_nt_pred_ai if (Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_pred_pi if (Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_pred_ppu if (Pv4) vmem(Rx32++Mu2):nt = Vs32 V6_vS32b_nt_npred_ai if (!Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_npred_pi if (!Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_npred_ppu if (!Pv4) vmem(Rx32++Mu2):nt = Vs32	2021-04-22 11:49:29 -05:00
Irina Dobrescu	123ae42566	[flang][openmp] Add General Semantic Checks for Allocate Directive This patch adds semantic checks for the General Restrictions of the Allocate Directive. Since the requires directive is not yet implemented in Flang, the restriction: ``` allocate directives that appear in a target region must specify an allocator clause unless a requires directive with the dynamic_allocators clause is present in the same compilation unit ``` will need to be updated at a later time. A different patch will be made with the Fortran specific restrictions of this directive. I have used the code from https://reviews.llvm.org/D89395 for the CheckObjectListStructure function. Co-authored-by: Isaac Perry <isaac.perry@arm.com> Reviewed By: clementval, kiranchandramohan Differential Revision: https://reviews.llvm.org/D91159	2021-04-22 16:15:06 +00:00
Simon Pilgrim	41091614d6	[LTO] Caching.h - remove unused <string> include. NFCI.	2021-04-22 14:07:12 +01:00
Jun Ma	978eb3f168	[DAGCombiner] Allow operand of step_vector to be negative. It is proper to relax non-negative limitation of step_vector. Also this patch adds more combines for step_vector: (sub X, step_vector(C)) -> (add X, step_vector(-C)) Differential Revision: https://reviews.llvm.org/D100812	2021-04-22 20:58:03 +08:00
Jay Foad	82d34fe2b3	Fix typo "beneficiates" in comments	2021-04-22 12:30:16 +01:00
Wenlei He	dff8315892	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles The change adds support for triming and merging cold context when mergine CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful. Differential Revision: https://reviews.llvm.org/D100528	2021-04-22 00:42:37 -07:00
Max Kazantsev	8fe62b7af1	[GVN] Introduce loop load PRE This patch allows PRE of the following type of loads: ``` preheader: br label %loop loop: br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p br label %merge merge: ... br i1 ..., label %loop, label %exit ``` Into ``` preheader: %x0 = load %p br label %loop loop: %x.pre = phi(x0, x2) br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p %x1 = load %p br label %merge merge: x2 = phi(x.pre, x1) ... br i1 ..., label %loop, label %exit ``` So instead of loading from %p on every iteration, we load only when the actual clobber happens. The typical pattern which it is trying to address is: hot loop, with all code inlined and provably having no side effects, and some side-effecting calls on cold path. The worst overhead from it is, if we always take clobber block, we make 1 more load overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken at least once, the transform is neutral or profitable. There are several improvements prospect open up: - We can sometimes be smarter in loop-exiting blocks via split of critical edges; - If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that we don't know if their sum is colder than the header. Differential Revision: https://reviews.llvm.org/D99926 Reviewed By: reames	2021-04-22 12:50:38 +07:00
Giorgis Georgakoudis	a2dbfb6b72	[OpenMP] Simplify offloading parallel call codegen This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D95976	2021-04-21 18:46:07 -07:00
Fangrui Song	77ac823fd2	Delete le32/le64 targets They are unused now. Note: NaCl is still used and is currently expected to be needed until 2022-06 (https://blog.chromium.org/2020/08/changes-to-chrome-app-support-timeline.html). Differential Revision: https://reviews.llvm.org/D100981	2021-04-21 18:44:12 -07:00
Hongtao Yu	1a719089a8	[CSSPGO][llvm-profgen] Always report dangling probes for frames with real samples. Report dangling probes for frames that have real samples collected. Dangling probes are the probes associated to an empty block. When reported, sample count on a dangling probe will not be trusted by the compiler and we will rely on the counts inference algorithm to get the probe a reasonable count. This actually fixes a bug where previously only those dangling probes with samples collected were reported. This patch also fixes two existing issues. Pseudo probes are stored in `Address2ProbesMap` and their pointers are used in `PseudoProbeInlineTree`. Previously `std::vector` was used to store probes and the pointers to probes may get obsolete as the vector grows. I'm changing `std::vector` to `std::list` instead. The other issue is that all outlined functions shared the same inline frame previously due to the unchanged `Index` value as the dummy inlineSite identifier. Good results seen for SPEC2017 in general regarding profile quality. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D100235	2021-04-21 18:07:58 -07:00
Fangrui Song	ac303795a7	[IR] Add doc about Function::createWithDefaultAttr. NFC	2021-04-21 16:20:50 -07:00
Fangrui Song	775a9483e5	[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module. Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified. (i.e. If -g[123], every function gets `.eh_frame`. This behavior is strange but that is the status quo on GCC and Clang.) Let's take asan as an example. Other sanitizers are similar. `asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true, so every function gets `.eh_frame` if `-g[123]` is specified. This is the root cause that `-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame while `-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame. This patch * sets the nounwind attribute on sanitizer module ctor/dtor. * let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute. The "uwtable" mechanism is generic: synthesized functions not cloned/specialized from existing ones should consider `Function::createWithDefaultAttr` instead of `Function::create` if they want to get some default attributes which have more of module semantics. Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955 https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc. Differential Revision: https://reviews.llvm.org/D100251	2021-04-21 15:58:20 -07:00
dfukalov	a8b35e0f52	[TTI] NFC: Change getVectorSplitCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D100952	2021-04-21 17:32:02 +03:00
Anirudh Prasad	8f6185c713	[AsmParser][ms][X86] Fix possible misbehaviour in parsing of special tokens at start of string. - Previously, https://reviews.llvm.org/D72680 introduced a new attribute called `AllowSymbolAtNameStart` (in relation to the MAsmParser changes) in `MCAsmInfo.h` which (according to the comment in the header) allows the following behaviour: ``` /// This is true if the assembler allows $ @ ? characters at the start of /// symbol names. Defaults to false. ``` - However, the usage of this field in AsmLexer.cpp doesn't seem completely accurate* for a couple of reasons. ``` default: if (MAI.doesAllowSymbolAtNameStart()) { // Handle Microsoft-style identifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]* if (!isDigit(CurChar) && isIdentifierChar(CurChar, MAI.doesAllowAtInName(), AllowHashInIdentifier)) return LexIdentifier(); } ``` 1. The Dollar and At tokens, when occurring at the start of the string, are treated as separate tokens (AsmToken::Dollar and AsmToken::At respectively) and not lexed as an Identifier. 2. I'm not too sure why `MAI.doesAllowAtInName()` is used when `AllowAtInIdentifier` could be used. For X86 platforms, afaict, this shouldn't be an issue, since the `CommentString` attribute isn't "@". (alternatively the call to the setter can be set anywhere else as needed). The `AllowAtInName` does have an additional important meaning, but in the context of AsmLexer, shouldn't mean anything different compared to `AllowAtInIdentifier` My proposal is the following: - Introduce 3 new fields called `AllowQuestionTokenAtStartOfString`, `AllowDollarTokenAtStartOfString` and `AllowAtTokenAtStartOfString` in MCAsmInfo.h which will encapsulate the previously documented behaviour of "allowing $, @, ? characters at the start of symbol names") - Introduce these fields where "$", "@" are lexed, and treat them as identifiers depending on whether `Allow[Dollar\|At]TokenAtStartOfString` is set. - For the sole case of "?", append it to the existing logic for treating a "default" token as an Identifier. z/OS (HLASM) will also make use of some of these fields in follow up patches. completely accurate* - This was based on the comments and the intended behaviour the code. I might have completely misinterpreted it, and if that is the case my sincere apologies. We can close this patch if necessary, if there are no changes to be made :) Depends on https://reviews.llvm.org/D99374 Reviewed By: Jonathan.Crowther Differential Revision: https://reviews.llvm.org/D99889	2021-04-21 10:21:09 -04:00
Nico Weber	ba7a92c01e	[Support] Don't include VirtualFileSystem.h in CommandLine.h CommandLine.h is indirectly included in ~50% of TUs when building clang, and VirtualFileSystem.h is large. (Already remarked by jhenderson on D70769.) No behavior change. Differential Revision: https://reviews.llvm.org/D100957	2021-04-21 10:19:01 -04:00
Simon Pilgrim	68b9b769b5	[MC] MCInstrDesc.h - remove unnecessary <string> include. NFCI.	2021-04-21 15:07:00 +01:00
Fraser Cormack	e6ff89dc2e	[SelectionDAG] Fix minor typo in ISDOpcodes.h. NFC	2021-04-21 14:38:07 +01:00
serge-sans-paille	d9806334d1	Use SmallVector instead of std::vector to manage storage of llvm::BitVector This is a follow-up to https://reviews.llvm.org/D100387. std::vector is not the best storage container here. My local benchmark (counting the number of instruction when compiling the sqlite3 amalgamation) yields the following: - std::vector<BitVector> -> 5,860,885,896 - SmallVector<BitWord, 0> -> 5,858,991,997 - SmallVector<BitWord> -> 5,817,679,224 Differential Revision: https://reviews.llvm.org/D100744	2021-04-21 07:31:28 +02:00
Arthur Eubanks	dd56715326	[NFC] Remove redundant InstCombinePass name	2021-04-20 22:23:07 -07:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst a IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Roman Lebedev	7186764884	[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr()	2021-04-20 21:29:21 +03:00
Joseph Huber	b2ad63d3cf	[OpenMP] Add OpenMPOpt as a Module pass Summary: This patch registers OpenMPOpt as a Module pass in addition to a CGSCC pass. This is so certain optimzations that are sensitive to intact call-sites can happen before inlining. The old `openmpopt` pass name is changed to `openmp-opt-cgscc` and `openmp-opt` calls the Module pass. The current module pass only runs a single check but will be expanded in the future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D99202	2021-04-20 12:28:58 -04:00
Simon Pilgrim	2a419a0b99	[X86][SSE] combineX86ShuffleChain - check if we're blending with zero into already zero elements Add a SelectionDAG::MaskedElementsAreZero helper that wraps SelectionDAG::MaskedValueIsZero testing for entirely zero vector elements	2021-04-20 17:09:49 +01:00
Ahmed Bougacha	a8a3a43792	[AArch64] Add apple-m1 CPU, and default to it for macOS. apple-m1 has the same level of ISA support as apple-a14, so this is a straightforward mechanical change. However, that also means this inherits apple-a14's v8.5a+nobti quirkiness. rdar://68287159	2021-04-20 08:41:04 -07:00
Matt Arsenault	83a25a1010	GlobalISel: Restrict narrow scalar for fptoui/fptosi results This practically only works for the f16 case AMDGPU uses, not wider types. Fixes bug 49710 by failing legalization.	2021-04-20 10:54:40 -04:00
Andrea Di Biagio	2226d21896	[MCA][LSUnit] Fix a potential use after free in the logic that updates memory groups. Make sure that the `CriticalMemoryInstruction` of a memory group is invalidated if it references an already executed instruction. This avoids a potential use-after-free if the critical memory info becomes stale, and the value is read after the instruction has executed.	2021-04-20 13:30:45 +01:00
Cullen Rhodes	8a6772f3aa	[ValueTypes] Fix sizes of v256i32 and v256f32 (8182 -> 8192)	2021-04-20 12:10:02 +00:00
Simon Pilgrim	5ed8cea9a8	[Support] APInt.h - remove <algorithm> include. NFCI. Replace std::min use which should allow us to avoid including the <algorithm> header in every include of APInt.h.	2021-04-20 11:21:39 +01:00
Simon Pilgrim	1c6df71a9b	[CodeGen] CodeGenPassBuilder.h - remove unnecessary <string> include. NFCI. We only use StringRef so include that.	2021-04-20 11:21:39 +01:00
Simon Pilgrim	2ea6ed9b70	[Support] BinaryStreamReader.h - remove unnecessary <string> include. NFCI. We only use StringRef so include that.	2021-04-20 10:31:12 +01:00
Arthur Eubanks	5e71b9fa93	Explicitly pass type to cast load constant folding result Previously we would use the type of the pointee to determine what to cast the result of constant folding a load. To aid with opaque pointer types, we should explicitly pass the type of the load rather than looking at pointee types. ConstantFoldLoadThroughBitcast() converts the const prop'd value to the proper load type (e.g. [1 x i32] -> i32). Instead of calling this in every intermediate step like bitcasts, we only call this when we actually see the global initializer value. In some existing uses of this API, we don't know the exact type we're loading from immediately (e.g. first we visit a bitcast, then we visit the load using the bitcast). In those cases we have to manually call ConstantFoldLoadThroughBitcast() when simplifying the load to make sure that we cast to the proper type. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100718	2021-04-20 00:53:21 -07:00
Fraser Cormack	457da7f298	[SelectionDAG] Relax constraints on STEP_VECTOR step operand This patch relaxes the requirement that the STEP_VECTOR step constant must be of a type at least as large as the vector element type. This does not permit its use on targets which have legal vector element types larger than the largest legal scalar type, such as i64 vectors on RV32. As such, the requirement has been loosened so that the step operand must be any scalar type so long as the constant immediate is non-negative and the value fits inside the vector element type. This limits combining optimizations in certain circumstances but in practice it's unlikely to be a hindrance. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D100660	2021-04-20 08:41:42 +01:00
Hongtao Yu	1812319292	[CSSPGO] Flip SkipPseudoOp to true for MIR APIs. Flipping the default value of SkipPseudoOp to true for those MIR APIs to favor maximum performance. Note that certain spots like branch folding and MIR if-conversion is are disabled for better counts quality. For these two optimizations, this is a no-diff change. The counts quality with SPEC2017 before/after this change is unchanged. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100332	2021-04-19 17:55:34 -07:00
David Penry	ca8eef7e3d	[CodeGen] Use ProcResGroup information in SchedBoundary When the ProcResGroup has BufferSize=0, 1. if there is a subunit in the list of write resources for the scheduling class, do not attempt to schedule the ProcResGroup. 2. if there is not a subunit in the list of write resources for the scheduling class, choose a subunit to use instead of the ProcResGroup. 3. having both the ProcResGroup and any of its subunits in the resources implied by a InstRW is not supported. Used to model parallel uses from a pool of resources. Differential Revision: https://reviews.llvm.org/D98976	2021-04-19 21:27:45 +01:00
Jinsong Ji	d88d8c5b86	[PowerPC] Disable relative lookup table converter pass for AIX XCOFF hasn't implemented lowerRelativeReference. So we need to disable new pass introduced by https://reviews.llvm.org/D94355 for AIX for now. Reviewed By: gulfem Differential Revision: https://reviews.llvm.org/D100584	2021-04-19 19:28:11 +00:00
Nick Desaulniers	c440b97d89	[TargetLowering] move "o" and "X" constraint handling to base class These constraints are machine agnostic; there's no reason to handle these per-arch. If arches don't support these constraints, then they will fail elsewhere during instruction selection. We don't need virtual calls to look these up; TargetLowering::getInlineAsmMemConstraint should only be overridden by architectures with additional unique memory constraints. Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D100416	2021-04-19 10:53:31 -07:00
Jessica Paquette	91bbb914e0	[AArch64][GlobalISel] Regbankselect + select @llvm.aarch64.neon.uaddlv It turns out we actually import a bunch of selection code for intrinsics. The imported code checks that the register banks on the G_INTRINSIC instruction are correct. If so, it goes ahead and selects it. This adds code to AArch64RegisterBankInfo to allow us to correctly determine register banks on intrinsics which have known register bank constraints. For now, this only handles @llvm.aarch64.neon.uaddlv. This is necessary for porting AArch64TargetLowering::LowerCTPOP. Also add a utility for getting the intrinsic ID from a G_INTRINSIC instruction. This seems a little nicer than having to know about how intrinsic instructions are structured. Differential Revision: https://reviews.llvm.org/D100398	2021-04-19 10:47:49 -07:00
Roman Lebedev	b8a3705896	[NFCI][SCEVExpander] Extract GetOptimalInsertionPointForCastOf() helper	2021-04-19 18:38:38 +03:00
Simon Pilgrim	ddcdeae358	[Analysis] ImportedFunctionsInliningStatistics.h - add <memory> and remove unused <string> include. NFCI. Move <string> include to ImportedFunctionsInliningStatistics.cpp and add missing <memory> include as we have explicit uses of std::unique_ptr in the header.	2021-04-19 16:20:56 +01:00
Simon Pilgrim	3b02de173b	[Support] Memory.h - remove unnecessary <string> include. NFCI. protectMappedMemory no longer returns an error message, so we don't need std::string - I've fixed an unnecessary doxygen entry as well (oddly I wasn't seeing a Wdocumentation warning)	2021-04-19 14:32:07 +01:00
Abhina Sreeskantharajan	05b4babc9d	[SystemZ][z/OS] Set more text files as text This patch corrects more instances of text files being opened as text. Reviewed By: Jonathan.Crowther Differential Revision: https://reviews.llvm.org/D100654	2021-04-19 09:31:46 -04:00
Paul C. Anagnostopoulos	a5aaec8f4e	[TableGen] Add support for the 'assert' statement in multiclasses This is step 3 of adding the 'assert' statement. Differential Revision: https://reviews.llvm.org/D99751	2021-04-19 09:01:42 -04:00
Simon Pilgrim	cf2fc41bd1	[IR] GlobalObject.h - remove unused <utility> include. NFCI. In fact there's no explicit use of any std:: type or method in this header.	2021-04-19 13:25:35 +01:00
Simon Pilgrim	228207fe94	[IR] GlobalObject.h - remove unused <string> include. NFCI. All string usage is hidden behind StringRefs - so we don't need an explicit <string> include.	2021-04-19 12:56:10 +01:00
Simon Pilgrim	7f0ea5c8b6	[MCA] CodeEmitter.h - remove unused <string> include. NFCI. Add explicit SmallString.h include - which is used in the header	2021-04-19 12:56:09 +01:00
Cullen Rhodes	f0bc2782f2	[TTI] NFC: Remove unused 'OptSize' parameter from shouldMaximizeVectorBandwidth Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D100377	2021-04-19 11:01:34 +00:00
OCHyams	0ebf9a8e34	[DebugInfo] Move the findDbg* functions into DebugInfo.cpp Move the findDbg* functions into lib/IR/DebugInfo.cpp from lib/Transforms/Utils/Local.cpp. D99169 adds a call to a function (findDbgUsers) that lives in lib/Transforms/Utils/Local.cpp (LLVMTransformUtils) from lib/IR/Value.cpp (LLVMCore). The Core lib doesn't include TransformUtils. The builtbots caught this here: https://lab.llvm.org/buildbot/#/builders/109/builds/12664. This patch moves the function, and the 3 similar ones for consistency, into DebugInfo.cpp which is part of LLVMCore. Reviewed By: dblaikie, rnk Differential Revision: https://reviews.llvm.org/D100632	2021-04-19 10:30:25 +01:00
Roman Lebedev	d480f968ad	Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`" As being discussed in https://reviews.llvm.org/D100721, this modelling is lossy, we can't reconstruct `ash`/`ashr exact` from it, which means that whenever we actually expand the IR, we've just pessimized the code.. It would be good to model this pattern, after all it comes up every time you want to compute a distance between two pointers, but not at this cost. This reverts commit `ec54867df5`.	2021-04-18 16:26:45 +03:00
Juneyoung Lee	2813acb7d1	Update m_Undef to match vectors/aggrs with undefs and poisons mixed This fixes https://reviews.llvm.org/D93990#2666922 by teaching `m_Undef` to match vectors/aggrs with poison elements. As suggested, fixes in InstCombine files to use the `m_Undef` matcher instead of `isa<UndefValue>` will be followed. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D100122	2021-04-18 10:57:04 +09:00
Florian Hahn	d91f864ced	[ADT] Update RPOT to work with specializations of different types. At the moment, ReversePostOrderTraversal performs a post-order walk on the entry node of the passed in graph, rather than the graph type itself. If GT::NodeRef is the same as GraphT, everything works as expected and this is the case for the current uses in-tree. But it does not work as expected if GraphT != GT::NodeRef. In that case, we either fail to build (if there is no GraphTrait specialization for GT:NodeRef) or we pick the GraphTrait specialization for GT::NodeRef, instead of the specialization of GraphT. Both the depth-first and post-order iterators pick the expected specalization and this patch updates ReversePostOrderTraversal to delegate to po_begin & po_end to pick the right specialization, rather than forcing using GraphTraits<GT::NodeRef>, by first getting the entry node. This makes `ReversePostOrderTraversal<Graph<6>> RPOT(G);` build and work as expected in the test. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D100169	2021-04-17 20:45:04 +01:00
Florian Hahn	bbf01f96b5	[ADT] Take graph as const & in some post-order iterators (NFC). This patch updates a couple of functions that unnecessarily took the input graph by value, when it was not needed. They can take the graph by const-reference instead, which does not require GraphT to provide a copy constructor. Split off from D100169.	2021-04-17 17:05:24 +01:00
Simon Pilgrim	595394321d	[Support] AbsoluteDifference - add brackets to appease static analyzer warning. NFCI.	2021-04-17 13:47:02 +01:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalize the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
serge-sans-paille	550ed575cb	Simplify BitVector code Instead of managing memory by hand, delegate it to std::vector. This makes the code much simpler, and also avoids repeatedly computing the storage size. According to valgrind --tool=callgrind, this also slightly decreases the instruction count, but by a small margin. This is a recommit of `82f0e3d3ea` with one usage fixed in llvm/lib/CodeGen/RegisterScavenging.cpp. Not the slight API change: BitVector::clear() now has the same behavior as any other container: it does not free memory, but indeed sets the size of the BitVector to 0. It is thus incorrect to access its content right afterwards, a scenario which wasn't enforced in previous implementation. Differential Revision: https://reviews.llvm.org/D100387	2021-04-16 22:48:33 +02:00
Thomas Lively	5c729750a6	[WebAssembly] Remove saturating fp-to-int target intrinsics Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead. This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u}, which are now represented in IR as a saturating truncation to a v2i32 followed by a concatenation with a zero vector. Differential Revision: https://reviews.llvm.org/D100596	2021-04-16 12:11:20 -07:00
Nico Weber	0b36a33ab8	Reland "[Support] Don't include <algorithm> in MathExtras.h" This reverts commit `af2a93fd6e`. This time, add the include to APInt.h, which apparently relied on getting this include transitively.	2021-04-16 14:07:45 -04:00
Stella Stamenova	af2a93fd6e	Revert "[Support] Don't include <algorithm> in MathExtras.h" This reverts commit `6580d8a2b1`.	2021-04-16 10:22:32 -07:00
Nick Lewycky	244d9d6e41	Verify the LLVMContext that an Attribute belongs to. Attributes don't know their parent Context, adding this would make Attribute larger. Instead, we add hasParentContext that answers whether this Attribute belongs to a particular LLVMContext by checking for itself inside the context's FoldingSet. Same with AttributeSet and AttributeList. The Verifier checks them with the Module context. Differential Revision: https://reviews.llvm.org/D99362	2021-04-16 09:44:38 -07:00
Stanislav Mekhanoshin	0777d1ec06	Ignore assume like calls by default in hasAddressTaken() Differential Revision: https://reviews.llvm.org/D96179	2021-04-16 09:37:33 -07:00
Nico Weber	da62725874	[ADT] Don't include <algorithm> in iterator.h As far as I can tell, nothing in iterator.h uses anything from <algorithm>. Differential Revision: https://reviews.llvm.org/D100659	2021-04-16 12:21:08 -04:00
Michael Liao	853da5977e	Revert "[Support] Don't include <algorithm> in Hashing.h" This reverts commit `ef620c40f3`. - `std::rotate` still needs <alogirthm>	2021-04-16 12:17:42 -04:00
Nico Weber	ef620c40f3	[Support] Don't include <algorithm> in Hashing.h The include is for std::swap(), but that's in <utility> in C++11, and Hashing.h already includes that. Differential Revision: https://reviews.llvm.org/D100657	2021-04-16 12:04:53 -04:00
Nico Weber	6580d8a2b1	[Support] Don't include <algorithm> in MathExtras.h MathExtras.h is indirectly included in over 98% of LLVM's translation units. It currently expands to over 1MB of stuff, over which far more than half is due to <algorithm>. Since not using <algorithm> is slightly less code, do that. No behavior change. Differential Revision: https://reviews.llvm.org/D100656	2021-04-16 11:53:31 -04:00
Mats Petersson	517c3aee4d	[OpenMP IRBuilder, MLIR] Add support for OpenMP do schedule dynamic The implementation supports static schedule for Fortran do loops. This implements the dynamic variant of the same concept. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D97393	2021-04-16 16:09:49 +01:00
Abhina Sreeskantharajan	3be2ba0ba3	[SystemZ][z/OS][Windows] Add new functions that set Text/Binary mode for Stdin and Stdout based on OpenFlags On Windows, we want to open a file in Binary mode if OF_CRLF bit is not set. On z/OS, we want to open a file in Binary mode if the OF_Text bit is not set. This patch creates two new functions called ChangeStdinMode and ChangeStdoutMode which will take OpenFlags as an arg to determine which mode to set stdin and stdout to. This will enable patches like https://reviews.llvm.org/D100056 to not affect Windows when setting the OF_Text flag for raw_fd_streams. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D100130	2021-04-16 08:09:19 -04:00
Arthur Eubanks	9c776c2fa2	[NFC][NewPM] Remove some AnalysisManager invalidate methods These were misleading, they're more of a "clear" than an "invalidate". We shouldn't be individually clearing analysis results. Either we clear all analyses when some IR becomes invalid, or we properly go through invalidation. There was only one use of this, which can be simulated with AM.invalidate(F, PA). Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D100519	2021-04-15 16:51:26 -07:00
River Riddle	706c9c5ce0	[mlir] Add support for walking locations similarly to Operations This allows for walking all nested locations of a given location, and is generally useful when processing locations. Differential Revision: https://reviews.llvm.org/D100437	2021-04-15 16:09:34 -07:00
Momchil Velikov	f9d932e673	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e..g byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role, previously perfomed by align, falling back to align if stackalign` is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
cchen	e0c2125d1d	[OpenMP] Added codegen for masked directive Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100514	2021-04-15 12:55:07 -05:00
Arthur Eubanks	c8f0a7c215	[NewPM] Cleanup IR printing instrumentation Being lazy with printing the banner seems hard to reason with, we should print it unconditionally first (it could also lead to duplicate banners if we have multiple functions in -filter-print-funcs). The printIR() functions were doing too many things. I separated out the call from PrintPassInstrumentation since we were essentially doing two completely separate things in printIR() from different callers. There were multiple ways to generate the name of some IR. That's all been moved to getIRName(). The printing of the IR name was also inconsistent, now it's always "IR Dump on $foo" where "$foo" is the name. For a function, it's the function name. For a loop, it's what's printed by Loop::print(), which is more detailed. For an SCC, it's the list of functions in parentheses. For a module it's "[module]", to differentiate between a possible SCC with a function called "module". To preserve D74814, we have to check if we're going to print anything at all first. This is unfortunate, but I would consider this a special case that shouldn't be handled in the core logic. Reviewed By: jamieschmeiser Differential Revision: https://reviews.llvm.org/D100231	2021-04-15 09:50:55 -07:00
LemonBoy	24185541ca	[yaml2obj/obj2yaml/llvm-readobj] Support printing and parsing AVR-specific e_flags The `e_flags` contains a mixture of bitfields and regular ones, ensure all of them can be serialized and deserialized. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100250	2021-04-15 15:54:28 +02:00
Alex Orlov	49cbf4cd85	Fix bug in .eh_frame/.debug_frame PC offset calculation for DW_EH_PE_pcrel This fixes the following bugs: https://bugs.llvm.org/show_bug.cgi?id=27249 https://bugs.llvm.org/show_bug.cgi?id=46414 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100328	2021-04-15 15:06:20 +04:00
dfukalov	ce1626f34a	[AA] Updates for D95543. Addressing latter comments in D95543: - `AliasResult::Result` renamed to `AliasResult::Kind` - Offset printing added for `PartialAlias` case in `-aa-eval` - Removed VisitedPhiBBs check from BasicAA' Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D100454	2021-04-15 12:22:03 +03:00
Nikita Popov	a1ed025d0e	Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting" This reverts commit `faf9f11589`. Issues with this patch have been reported in https://reviews.llvm.org/D100264#2689917 and https://bugs.llvm.org/show_bug.cgi?id=49967.	2021-04-15 09:43:52 +02:00
Nico Weber	d5e8dca1b6	fix comment typos to cycle bots	2021-04-14 22:12:56 -04:00
Alexander Yermolovich	b7459a10da	[DWARF] Fix crash for DWARFDie::dump. When DIE is extracted manually, the DieArray is empty. When dump is invoked on aforementioned DIE it tries to extract child, even if Dump options say otherwise. Resulting in crash. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99698	2021-04-14 18:46:34 -07:00
Sterling Augustine	8f9477b067	Revert "Simplify BitVector code" This reverts commit `82f0e3d3ea`. The change breaks the asan buildbots. https://lab.llvm.org/buildbot/#/builders/99/builds/2835	2021-04-14 18:06:51 -07:00
Philip Reames	dd985551c2	Reapply "[InferAttributes] Materialize all infered attributes for declaration"" and follow on patches. This reverts commit `ab98f2c712` and `98eea392cd`. It includes a fix for the clang test which triggered the revert. I failed to notice this one because there was another AMDGPU llvm test with a similiar name and the exact same text in the error message. Odd. Since only one build bot reported the clang test, I didn't notice that one.	2021-04-14 16:38:07 -07:00
Nico Weber	ab98f2c712	Revert "[InferAttributes] Materialize all infered attributes for declaration" Breaks check-clang, see comments on D100400 Also revert follow-up "[NFC] Move a recently added utility into a location to enable reuse" This reverts commit `3ce61fb6d6`. This reverts commit `61a85da882`.	2021-04-14 18:41:20 -04:00
Philip Reames	3ce61fb6d6	[NFC] Move a recently added utility into a location to enable reuse About to refresh a patch that uses this in FunctionAtrrs, doing the move seperately to control build times.	2021-04-14 15:05:16 -07:00
Thomas Lively	6a18cc23ef	[WebAssembly] Codegen for i64x2.extend_{low,high}_i32x4_{s,u} Removes the builtins and intrinsics used to opt in to using these instructions and replaces them with normal ISel patterns now that they are no longer prototypes. Differential Revision: https://reviews.llvm.org/D100402	2021-04-14 13:43:09 -07:00
serge-sans-paille	82f0e3d3ea	Simplify BitVector code Instead of managing memory by hand, delegate it to std::vector. This makes the code much simpler, and also avoids repeatedly computing the storage size. According to valgrind --tool=callgrind, this also slightly decreases the instruction count, but by a small margin. Differential Revision: https://reviews.llvm.org/D100387	2021-04-14 21:28:08 +02:00
Thomas Lively	af7925b4dd	[WebAssembly] Codegen for f64x2.convert_low_i32x4_{s,u} Add a custom DAG combine and ISD opcode for detecting patterns like (uint_to_fp (extract_subvector ...)) before the extract_subvector is expanded to ensure that they will ultimately lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions are no longer prototypes and can now be produced via standard IR, this commit also removes the target intrinsics and builtins that had been used to prototype the instructions. Differential Revision: https://reviews.llvm.org/D100425	2021-04-14 10:42:45 -07:00
Momchil Velikov	a0124f4e4d	Remove deprecated member functions (NFC) Remove the member functions getByValAlign and getOrigAlign, there were no users left. Differential Revision: https://reviews.llvm.org/D99098	2021-04-14 18:06:53 +01:00
Sander de Smalen	4f42d873c2	[TTI] NFC: Change getArithmeticInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100317	2021-04-14 17:20:36 +01:00
Sander de Smalen	d84bd951a8	[TTI] NFC: Change getFPOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D100316	2021-04-14 17:20:36 +01:00
Sander de Smalen	1af35e77f4	[TTI] NFC: Change getVectorInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100315	2021-04-14 17:20:35 +01:00
Sander de Smalen	174e8f6c5e	[TTI] NFC: Change getShuffleCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100314	2021-04-14 17:20:35 +01:00
Sander de Smalen	14b934f8a6	[TTI] NFC: Change getCFInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D100313	2021-04-14 17:20:34 +01:00
Sander de Smalen	596f669cfb	[TTI] NFC: Change getCallInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D100312	2021-04-14 17:20:34 +01:00
Thomas Lively	af7ab81ce3	[WebAssembly] Use standard intrinsics for f32x4 and f64x2 ops Now that these instructions are no longer prototypes, we do not need to be careful about keeping them opt-in and can use the standard LLVM infrastructure for them. This commit removes the bespoke intrinsics we were using to represent these operations in favor of the corresponding target-independent intrinsics. The clang builtins are preserved because there is no standard way to easily represent these operations in C/C++. For consistency with the scalar codegen in the Wasm backend, the intrinsic used to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though @llvm.roundeven better captures the semantics of the underlying Wasm instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is left to a potential future patch. Differential Revision: https://reviews.llvm.org/D100411	2021-04-14 09:19:27 -07:00
Sjoerd Meijer	39d29817f3	[SCCP] Follow up of rGbbab9f986c6d. NFC. This addresses the linter messages, mainly the inconsistent capitalisation of member functions.	2021-04-14 17:14:46 +01:00
Sjoerd Meijer	bbab9f986c	[SCCP] Create SCCP Solver This refactors SCCP and creates a SCCPSolver interface and class so that it can be used by other passes and transformations. We will use this in D93838, which adds a function specialisation pass. This is based on an early version by Vinay Madhusudan. Differential Revision: https://reviews.llvm.org/D93762	2021-04-14 14:58:03 +01:00
Martin Storsjö	57b259a852	[Passes] Enable the relative lookup table converter pass on aarch64 After `d5c5cf5ce8`, it should work fine for aarch64 on COFF too. (It was disabled when the patch was (re)applied in `e96df3e531`, pending that fix.)	2021-04-14 13:15:41 +03:00
Pengfei Wang	184377da5c	[LLD] Implement /guard:[no]ehcont Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99078	2021-04-14 15:06:49 +08:00
Daniel Sanders	be50657c6a	[TableGen] Resolve concrete but not complete field access initializers This fixes the resolution of Rec10.Zero in ListSlices.td. As part of this, correct the definition of complete for ListInit such that it's complete iff all the elements in the list are complete rather than always being complete regardless of the elements. This is the reason Rec10.TwoFive from ListSlices.td previously resolved despite being incomplete like Rec10.Zero was Depends on D100247 Reviewed By: Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D100253	2021-04-13 15:14:56 -07:00
Anirudh Prasad	6ddd8c28b7	[AsmParser][SystemZ][z/OS] Add support to AsmLexer to accept HLASM style integers - Add support for HLASM style integers. These are the decimal integers [0-9]. - HLASM does not support the additional prefixed integers like, `0b`, `0x`, octal integers and Masm style integers. - To achieve this, a field `LexHLASMStyleIntegers` (similar to the `LexMasmStyleIntegers` field) is introduced in `MCAsmLexer.h` as well as a corresponding setter. Note: This field could also go into MCAsmInfo.h. I used the previous precedent set by the `LexMasmIntegers` field. Depends on https://reviews.llvm.org/D99286 Reviewed By: epastor Differential Revision: https://reviews.llvm.org/D99374	2021-04-13 15:29:37 -04:00
Nikita Popov	faf9f11589	[SCEV] Don't walk uses of phis without SCEV expression when forgetting I've run into some cases where a large fraction of compile-time is spent invalidating SCEV. One of the causes is forgetLoop(), which walks all values that are def-use reachable from the loop header phis. When invalidating a topmost loop, that might be close to all values in a function. Additionally, it's fairly common for there to not actually be anything to invalidate, but we'll still be performing this walk again and again. My first thought was that we don't need to continue walking the uses if the current value doesn't have a SCEV expression. However, this isn't quite right, because SCEV construction can skip over values (e.g. for a chain of adds, we might only create a SCEV expression for the final value). What this patch does instead is to only walk the (full) def-use chain of loop phis that have a SCEV expression. If there's no expression for a phi, then we also don't have any dependent expressions to invalidate. Differential Revision: https://reviews.llvm.org/D100264	2021-04-13 20:28:17 +02:00
Anirudh Prasad	f7eec83932	[AsmParser][SystemZ][z/OS] Add in support to allow use of additional comment strings. - Currently, MCAsmInfo provides a CommentString attribute, that various targets can set, so that the AsmLexer can appropriately lex a string as a comment based on the set value of the attribute. - However, AsmLexer also supports a few additional comment syntaxes, in addition to what's specified as a CommentString attribute. This includes regular C-style block comments (/* ... /), regular C-style line comments (// .... ) and #. While I'm not sure as to why this behaviour exists, I am assuming it does to maintain backward compatibility with GNU AS (see https://sourceware.org/binutils/docs/as/Comments.html#Comments for reference) For example: Consider a target which sets the CommentString attribute to ''. The following strings are all lexed as comments. ``` "# abc" -> comment "// abc" -> comment "/* abc / -> comment " abc" -> comment ``` - In HLASM however, only "*" is accepted as a comment string, and nothing else. - To achieve this, an additional attribute (`AllowAdditionalComments`) has been added to MCAsmInfo. If this attribute is set to false, then only the string specified by the CommentString attribute is used as a possible comment string to be lexed by the AsmLexer. The regular C-style block comments, line comments and "#" are disabled. As a final note, "#" will still be treated as a comment, if the CommentString attribute is set to "#". Depends on https://reviews.llvm.org/D99277 Reviewed By: abhina.sreeskantharajan, myiwanch Differential Revision: https://reviews.llvm.org/D99286	2021-04-13 11:15:09 -04:00
Sander de Smalen	03f47bdcb1	[TTI] NFC: Change get[Interleaved]MemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100205	2021-04-13 14:21:02 +01:00
Sander de Smalen	d676b5749d	[TTI] NFC: Change getMaskedMemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100204	2021-04-13 14:21:01 +01:00
Sander de Smalen	db134e2428	[TTI] NFC: Change getCmpSelInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100203	2021-04-13 14:21:01 +01:00
Sander de Smalen	2285dfb73f	[TTI] NFC: Change getMinMaxReductionCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100202	2021-04-13 14:21:00 +01:00
Sander de Smalen	bd86824d98	[TTI] NFC: Change getArithmeticReductionCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html This patch is practically NFC, with the exception of an AArch64 SVE related cost-model change, where we can now return an Invalid cost instead of some bogus number. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100201	2021-04-13 14:20:59 +01:00
Sander de Smalen	fd1f8a5462	[TTI] NFC: Change getGatherScatterOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100200	2021-04-13 14:20:59 +01:00
Sander de Smalen	92d8421f49	[TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100199	2021-04-13 14:20:58 +01:00
Martin Storsjö	45f8946a75	[CodeView] Fix the ARM64 CPUType enum The old, incorrect one seems to have been added in `d41ac895bb`, with a similarly placed entry added in EnumTables.cpp in `eb4d6142dc`. This matches the value documented at https://docs.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cpu-type-e?view=vs-2019. This fixes running obj2yaml on an object file generated by MSVC. Differential Revision: https://reviews.llvm.org/D100306	2021-04-13 12:54:22 +03:00
Amy Huang	dad5caa59e	Revert "Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This change causes an assert / segmentation fault in LTO builds. This reverts commit `f2e4f3eff3`.	2021-04-12 20:10:17 -07:00
Freddy Ye	3fc1fe8db8	[X86] Support -march=rocketlake Reviewed By: skan, craig.topper, MaskRay Differential Revision: https://reviews.llvm.org/D100085	2021-04-13 09:48:13 +08:00
Gulfem Savrun Yeniceri	e96df3e531	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-04-13 01:29:41 +00:00
Arthur Eubanks	a8ab1f98d2	[Evaluator] Look through invariant.group intrinsics Turning on -fstrict-vtable-pointers in Chrome caused an extra global initializer. Turns out that a llvm.strip.invariant.group intrinsic was causing GlobalOpt to fail to step through some simple code. We can treat .invariant.group uses as simply their operand. Value::stripPointerCastsForAliasAnalysis() does exactly this. This should be safe because the Evaluator does not skip memory accesses due to invariants or alias analysis. However, we don't want to leak that we've stripped arbitrary pointer casts to users of Evaluator, so we bail out if we evaluate a function to any constant, since we may have looked through .invariant.group calls and aliasing pointers cannot be arbitrarily substituted. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D98843	2021-04-12 16:12:15 -07:00
Yuanfang Chen	c5fda0e662	Reland "Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom"" This reverts commit `a3fabc79ae` (relands `f4d682d6ce` with fix for the compile-time regression issue).	2021-04-12 14:50:54 -07:00
Nikita Popov	a3fabc79ae	Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom" This reverts commit `f4d682d6ce`. This caused a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=4b7bad9eaea2233521a94f6b096aaa88dc584e23&to=f4d682d6ce6c5b3a41a0acf297507c82f5c21eef&stat=instructions Possibly this is due to overeager parsing of target triples.	2021-04-12 22:55:59 +02:00
Hamza Sood	0a92aff721	Replace uses of std::iterator with explicit using This patch removes all uses of `std::iterator`, which was deprecated in C++17. While this isn't currently an issue while compiling LLVM, it's useful for those using LLVM as a library. For some reason there're a few places that were seemingly able to use `std` functions unqualified, which no longer works after this patch. I've updated those places, but I'm not really sure why it worked in the first place. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D67586	2021-04-12 10:47:14 -07:00
Yuanfang Chen	f4d682d6ce	[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom D24453 enabled libcalls simplication for ARM PCS. This may cause caller/callee calling conventions mismatch in some situations such as LTO. This patch makes instcombine aware that the compatible calling conventions differences are benign (not emitting undef idom). Differential Revision: https://reviews.llvm.org/D99773	2021-04-12 09:32:23 -07:00
Stephen Tozer	f2e4f3eff3	Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" The causes of the previous build errors have been fixed in revisions `aa3e78a59f`, and `140757bfaa` This reverts commit `f40976bd01`.	2021-04-12 16:57:29 +01:00
Andrew Savonichev	f037b07b5c	Revert "[AArch64] Add Machine InstCombiner patterns for FMUL indexed variant" This reverts commit `cca9b5985c`. Buildbot reported an error for CodeGen/AArch64/machine-combiner-fmul-dup.mir: * Bad machine code: Virtual register killed in block, but needed live out. * - function: indexed_2s - basic block: %bb.0 entry (0x640fee8) Virtual register %7 is used after the block. * Bad machine code: Virtual register defs don't dominate all uses. * - function: indexed_2s - v. register: %7 LLVM ERROR: Found 2 machine code errors.	2021-04-12 16:28:49 +03:00
Andrew Savonichev	cca9b5985c	[AArch64] Add Machine InstCombiner patterns for FMUL indexed variant This patch adds DUP+FMUL => FMUL_indexed pattern to InstCombiner. FMUL_indexed is normally selected during instruction selection, but it does not work in cases when VDUP and VMUL are in different basic blocks. Differential Revision: https://reviews.llvm.org/D99662	2021-04-12 16:08:39 +03:00
Simon Pilgrim	199a21bd8c	[IR] Fix Wdocumentation warning. NFCI.	2021-04-12 11:20:57 +01:00
Zakk Chen	a8fc0e445c	[RISCV][Clang] Add all RVV Mask intrinsic functions. 1. Redefine vpopc and vfirst IR intrinsic so it could adapt on clang tablegen generator which always appends a type for vl in IntrinsicType of clang codegen. 2. Remove `c` type transformer and add `u` and `l` for unsigned long and long type. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100120	2021-04-11 19:19:02 -07:00
Roman Lebedev	9829f5e6b1	[CVP] @llvm.[us]{min,max}() intrinsics handling If we can tell that either one of the arguments is taken, bypass the intrinsic. Notably, we are indeed fine with non-strict predicate: * UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i * UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx * SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W * SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr Much like with all other comparison handling in CVP, while we could sort-of handle two Value's, at least for plain ICmpInst it does not appear to be worthwhile. This only fires 78 times on test-suite + dt + rs, but we don't canonicalize to these yet. (only SCEV produces them)	2021-04-11 00:33:47 +03:00
Wenlei He	00ef28ef21	[CSSPGO] Fix dangling context strings and improve profile order consistency and error handling This patch fixed the following issues along side with some refactoring: 1. Fix bugs where StringRef for context string out live the underlying std::string. We now keep string table in profile generator to hold std::strings. We also do the same for bracketed context strings in profile writer. 2. Make sure profile output strictly follow (total sample, name) order. Previously, there's inconsistency between ProfileMap's key and FunctionSamples's name, leading to inconsistent ordering. This is now fixed by introducing context profile canonicalization. Assertions are also added to make sure ProfileMap's key and FunctionSamples's name are always consistent. 3. Enhanced error handling for profile writing to make sure we bubble up errors properly for both llvm-profgen and llvm-profdata when string table is not populated correctly for extended binary profile. 4. Keep all internal context representation bracket free. This avoids creating new strings for context trimming, merging and preinline. getNameWithContext API is now simplied accordingly. 5. Factor out the code for context trimming and merging into SampleContextTrimmer in SampleProf.cpp. This enables llvm-profdata to use the trimmer when merging profiles. Changes in llvm-profgen will be in separate patch. Differential Revision: https://reviews.llvm.org/D100090	2021-04-10 12:39:10 -07:00
Roman Lebedev	257eda0794	[NFC][LVI] getPredicateAt(): drop default value for UseBlockValue The default is likely wrong. Out of all the callees, only a single one needs to pass-in false (JumpThread), everything else either already passes true, or should pass true. Until the default is flipped, at least make it harder to unintentionally add new callees with UseBlockValue=false.	2021-04-10 20:46:01 +03:00
Roman Lebedev	03225969e3	[NFC] Rename LimitingIntrinsic into MinMaxIntrinsic As requested in post-commit review	2021-04-10 20:46:01 +03:00
Roman Lebedev	e8c7f43e2c	[NFC][ConstantRange] Add 'icmp' helper method "Does the predicate hold between two ranges?" Not very surprisingly, some places were already doing this check, without explicitly naming the algorithm, cleanup them all.	2021-04-10 19:38:55 +03:00
Roman Lebedev	7b12c8c59d	Revert "[NFC][ConstantRange] Add 'icmp' helper method" This reverts commit `17cf2c9423`.	2021-04-10 19:37:53 +03:00
Roman Lebedev	8371dde485	Revert "zz" It wasn't meant to be committed, two commits should have been squashed. This reverts commit `0c18415496`.	2021-04-10 19:37:47 +03:00
Roman Lebedev	17cf2c9423	[NFC][ConstantRange] Add 'icmp' helper method "Does the predicate hold between two ranges?" Not very surprisingly, some places were already doing this check, without explicitly naming the algorithm, cleanup them all.	2021-04-10 19:09:52 +03:00
Roman Lebedev	0c18415496	zz	2021-04-10 19:09:17 +03:00
dfukalov	8f4b7e94a2	[AMDGPU][CostModel] Refine cost model for control-flow instructions. Added cost estimation for switch instruction, updated costs of branches, fixed phi cost. Had to increase `-amdgpu-unroll-threshold-if` default value since conditional branch cost (size) was corrected to higher value. Test renamed to "control-flow.ll". Removed redundant code in `X86TTIImpl::getCFInstrCost()` and `PPCTTIImpl::getCFInstrCost()`. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D96805	2021-04-10 09:20:24 +03:00
Duncan P. N. Exon Smith	0db6488a77	Support: Add move semantics to mapped_file_region Update llvm::sys::fs::mapped_file_region to have a move constructor and a move assignment operator, allowing it to be used as an Optional. Also, update FileOutputBuffer's OnDiskBuffer to take advantage of this, avoiding an extra allocation from the unique_ptr. A nice follow-up would be to make the mapped_file_region constructor private and replace its use with a factory function, such as mapped_file_region::create(), that returns an Expected (or ErrorOr). I don't plan on doing that immediately, but I might swing back later. No functionality change, besides the saved allocation in OnDiskBuffer. Differential Revision: https://reviews.llvm.org/D100159	2021-04-09 17:56:26 -07:00
cchen	1a43fd2769	[OpenMP51] Initial support for masked directive and filter clause Adds basic parsing/sema/serialization support for the #pragma omp masked directive. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99995	2021-04-09 14:00:36 -05:00
Adrian Prantl	6ce76ff7eb	Update the linkage name of coro-split functions in the debug info. This patch updates the linkage name in the DISubprogram of coro-split functions, which is particularly important for Swift, where the funclets have a special name mangling. This patch does not affect C++ coroutines, since the DW_AT_specification is expected to hold the (original) linkage name. I believe this is mostly due to limitations in AsmPrinter, so we might be able to relax this restriction in the future. Differential Revision: https://reviews.llvm.org/D99693	2021-04-09 09:50:56 -07:00
dfukalov	c1a88e007b	[AA][NFC] Convert AliasResult to class containing offset for PartialAlias case. Add an ability to store `Offset` between partially aliased location. Use this storage within returned `ResultAlias` instead of caching it in `AAQueryInfo`. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98718	2021-04-09 13:26:09 +03:00
dfukalov	d066079728	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. Main reason is preparation to transform AliasResult to class that contains offset for PartialAlias case. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98027	2021-04-09 12:54:22 +03:00
Max Kazantsev	275f3a2540	[GVN][NFC] Factor out load elimination logic via PRE for reuse	2021-04-09 16:12:25 +07:00
Duncan P. N. Exon Smith	4a84b03ece	ADT: Sink the guts of StringMapEntry::Create into StringMapEntryBase Sink the interesting parts of StringMapEntry::Create into a new function StringMapEntryBase::allocateWithKey that's only templated on the allocator, taking the entry size and alignment as parameters. As dblaikie pointed out in the review, it'd be interesting as a follow-up to make this more generic and maybe sink at least some of it into a source file; I haven't done that yet myself, but I left behind an encouraging comment. Differential Revision: https://reviews.llvm.org/D95654	2021-04-08 17:57:47 -07:00
Duncan P. N. Exon Smith	6dc432510f	Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC, 3rd attempt This reverts commit `e35afbe535`, reapplying `022ccedde8` and `e7ed5c920d`. - The first attempt missed defining `SignpostEmitterImpl`. - The second attempt missed defining `llvm::SignpostEmitterImpl`. Not sure how I failed to test both versions locally before; I thought I'd turned the feature off via rerunning `cmake` but it must have been stuck in place. This time I confirmed via `clang -E` that I was testing both build configurations. Original commit message: Replace some manual memory management with std::unique_ptr. Differential Revision: https://reviews.llvm.org/D100151	2021-04-08 17:05:59 -07:00
Duncan P. N. Exon Smith	e35afbe535	Revert "Revert "Revert "Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC""" This reverts commit `e7ed5c920d` again, due to more buildbot failures: https://lab.llvm.org/buildbot/#/builders/131/builds/8191	2021-04-08 16:58:12 -07:00
Duncan P. N. Exon Smith	e7ed5c920d	Revert "Revert "Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC"" This reverts commit `078072285d`, reapplying `022ccedde8`. I figured out why this was failing in other environments: it's not a problem with std::unique_ptr, but that SignpostEmitterImpl only has a forward declaration. Adding an empty definition should do the trick. Original commit message: Replace some manual memory management with std::unique_ptr. Differential Revision: https://reviews.llvm.org/D100151	2021-04-08 16:50:39 -07:00
Duncan P. N. Exon Smith	078072285d	Revert "Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC" This reverts commit `022ccedde8`. Looks like some hosts need a definition of SignpostEmitterImpl to put it in a unique_ptr: https://lab.llvm.org/buildbot/#/builders/92/builds/7733	2021-04-08 16:38:47 -07:00
Duncan P. N. Exon Smith	022ccedde8	Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC Replace some manual memory management with std::unique_ptr. Differential Revision: https://reviews.llvm.org/D100151	2021-04-08 16:31:59 -07:00
Duncan P. N. Exon Smith	429088b9e2	Support: Extract fs::resize_file_before_mapping_readwrite from FileOutputBuffer Add a variant of `fs::resize_file` for use immediately before opening a file with `mapped_file_region::readwrite`. On Windows, `_chsize` (`ftruncate`) is slow, but `CreateFileMapping` (`mmap`) automatically extends the file so the call to `fs::resize_file` can be skipped. This optimization was added to `FileOutputBuffer` in da9bc2e56d5a5c6332a9def1a0065eb399182b93; this commit just extracts the logic out and adds a unit test. Differential Revision: https://reviews.llvm.org/D95490	2021-04-08 16:26:35 -07:00
Arthur Eubanks	c5d1ccbcdf	[GVN] Properly invalidate ICF cache when we simplify a value This fixes a "Cached first special instruction is wrong!" assert. The assert fires because replacing a value with another can cause an instruction to no longer be "special" to ICF. In this case, devirtualization happened, turning an indirect call to a call to a willreturn function which is no longer special. Reviewed By: nikic, rnk Differential Revision: https://reviews.llvm.org/D99977	2021-04-08 14:01:57 -07:00
Paul C. Anagnostopoulos	3f919ff250	Revert "[TableGen] Add support for the 'assert' statement in multiclasses" This reverts commit `3b9a15d910`.	2021-04-08 13:58:58 -04:00
Paul C. Anagnostopoulos	3b9a15d910	[TableGen] Add support for the 'assert' statement in multiclasses	2021-04-08 08:36:03 -04:00
Stephen Tozer	140757bfaa	[DebugInfo] Prevent invalid debug info being produced during LoopStrengthReduce During LoopStrengthReduce, some of the SSA values that are used by debug values may be lost and/or salvaged. After LSR we attempt to recover any undef debug values, including any that were salvaged but then lost their values afterwards, by replacing the lost values with any live equal values (plus a possible constant offset) that have been gathered prior to running LSR. When we do this we restore the debug value's original DIExpression, to undo any salvaging (as we have gone back to using the original debug value). This process can currently produce invalid debug info if the number of operands has changed by salvaging during LSR. Replacing old values during the applyEqualValues step does not change the number of location operands, which means that when we restore the old DIExpression we may have a mismatch between the number of operands used by the debug value and the number of operands referenced by the DIExpression. This patch fixes this by restoring the full original location metadata at the start of the applyEqualValues step, so that there is no mismatch in operand count between the debug value and its DIExpression. Differential Revision: https://reviews.llvm.org/D98644	2021-04-08 13:04:48 +01:00
Arthur Eubanks	90af134473	Revert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()" This reverts commit `9583a3f262`. This wasn't NFC as initially thought. Needed for D99707.	2021-04-07 11:40:44 -07:00
LemonBoy	03f7b13d44	[X86] Initialize TargetOptions::StackProtectorGuardOffset member to its default value D88631 introduced a set of knobs to tweak how the stack protector is codegen'd for x86 targets, including the offset from the base register where the stack cookie is located. The `StackProtectorGuardOffset` field in `TargetOptions` was left uninitialized instead of being reset to its neutral value -1, making it possible to emit nonsensical code if the frontend doesn't change the field value at all before feeding the `TargetOptions` to the target machine initializer. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D99952	2021-04-07 09:04:25 +02:00
Jonas Devlieghere	5d07dc8977	[dsymutil] Don't emit .debug_pubnames and .debug_pubtypes Consider the .debug_pubnames and .debug_pubtypes their own kind of accelerator and stop emitting them together with the Apple-style accelerator tables. The only reason we were still emitting both was for (byte-for-byte) compatibility with dsymutil-classic. - This patch adds a new accelerator table kind "Pub" which can be specified with --accelerator=Pub. - This patch removes the ability to emit both pubnames/types and apple style accelerator tables. I don't think anyone is relying on that but it's worth pointing out. - This patch removes the --minimize option and makes this behavior the default. Specifying the flag will result in a warning but won't abort the program. Differential revision: https://reviews.llvm.org/D99907	2021-04-06 19:01:45 -07:00
Nicolás Alvarez	a1aada75f5	[docs] Fix doxygen comments wrongly attached to the llvm namespace Looking at the Doxygen-generated documentation for the llvm namespace currently shows all sorts of random comments from different parts of the codebase. These are mostly caused by: - File doc comments that aren't marked with \file, so they're attached to the next declaration, which is usually "namespace llvm {". - Class doc comments placed before the namespace rather than before the class. - Code comments before the namespace that (in my opinion) shouldn't be extracted by doxygen at all. This commit fixes these comments. The generated doxygen documentation now has proper docs for several classes and files, and the docs for the llvm and llvm::detail namespaces are now empty. Reviewed By: thakis, mizvekov Differential Revision: https://reviews.llvm.org/D96736	2021-04-07 01:20:18 +02:00
Craig Topper	0d6dd23ca9	[MachineValueTypes] Add blank lines between floating point vectors with different element types. NFC	2021-04-06 14:51:56 -07:00
Sidharth Baveja	d81d9e8b86	[SplitEdge] Update SplitCriticalEdge to return a nullptr only when the edge is not critical Summary: The function SplitCriticalEdge (called by SplitEdge) can return a nullptr in cases where the edge is a critical. SplitEdge uses SplitCriticalEdge assuming it can always split all critical edges, which is an incorrect assumption. The three cases where the function SplitCriticalEdge will return a nullptr is: 1. DestBB is an exception block 2. Options.IgnoreUnreachableDests is set to true and isa(DestBB->getFirstNonPHIOrDbgOrLifetime()) is not equal to a nullptr 3. LoopSimplify form must be preserved (Options.PreserveLoopSimplify is true) and it cannot be maintained for a loop due to indirect branches For each of these situations they are handled in the following way: 1. Modified the function ehAwareSplitEdge originally from llvm/lib/Transforms/Coroutines/CoroFrame.cpp to handle the cases when the DestBB is an exception block. This function is called directly in SplitEdge. SplitEdge does not call SplitCriticalEdge in this case 2. Options.IgnoreUnreachableDests is set to false by default, so this situation does not apply. 3. Return a nullptr in this situation since the SplitCriticalEdge also returned nullptr. Nothing we can do in this case. Reviewed By: asbirlea Differential Revision:https://reviews.llvm.org/D94619	2021-04-06 21:24:40 +00:00
Philip Reames	9ef6aa020b	Plumb AssumeInst through operand bundle apis [nfc] Follow up to `a6d2a8d6f5`. This covers all the public interfaces of the bundle related code. I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.	2021-04-06 12:53:53 -07:00
Philip Reames	a6d2a8d6f5	Add a subclass of IntrinsicInst for llvm.assume [nfc] Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class. A follow up change will do the same for the newer assumption query/bundle mechanisms.	2021-04-06 11:16:22 -07:00
Philip Reames	21d4839948	Move GCRelocateInst and GCResultInst to IntrinsicInst.h [nfc] These two are part of the IntrinsicInst class hierarchy and it helps to cut down on some redundant includes.	2021-04-06 08:33:15 -07:00
Kerry McLaughlin	7344f3d39a	[LoopVectorize] Add strict in-order reduction support for fixed-width vectorization Previously we could only vectorize FP reductions if fast math was enabled, as this allows us to reorder FP operations. However, it may still be beneficial to vectorize the loop by moving the reduction inside the vectorized loop and making sure that the scalar reduction value be an input to the horizontal reduction, e.g: %phi = phi float [ 0.0, %entry ], [ %reduction, %vector_body ] %load = load <8 x float> %reduction = call float @llvm.vector.reduce.fadd.v8f32(float %phi, <8 x float> %load) This patch adds a new flag (IsOrdered) to RecurrenceDescriptor and makes use of the changes added by D75069 as much as possible, which already teaches the vectorizer about in-loop reductions. For now in-order reduction support is off by default and controlled with the `-enable-strict-reductions` flag. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D98435	2021-04-06 14:45:34 +01:00
Abhina Sreeskantharajan	82b3e28e83	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text Problem: On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable. Solution: This patch adds two new flags - OF_CRLF which indicates that CRLF translation is used. - OF_TextWithCRLF = OF_Text \| OF_CRLF indicates that the file is text and uses CRLF translation. Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF. So this is the behaviour per platform with my patch: z/OS: OF_None: open in binary mode OF_Text : open in text mode OF_TextWithCRLF: open in text mode Windows: OF_None: open file with no carriage return OF_Text: open file with no carriage return OF_TextWithCRLF: open file with carriage return The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set. ``` if (Flags & OF_CRLF) CrtOpenFlags \|= _O_TEXT; ``` These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows. ./llvm/lib/Support/raw_ostream.cpp ./llvm/lib/TableGen/Main.cpp ./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp ./llvm/unittests/Support/Path.cpp ./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp ./clang/lib/Frontend/CompilerInstance.cpp ./clang/lib/Driver/Driver.cpp ./clang/lib/Driver/ToolChains/Clang.cpp Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D99426	2021-04-06 07:23:31 -04:00
Kerry McLaughlin	857b8a73da	[LoopVectorize] Change the identity element for FAdd Changes getRecurrenceIdentity to always return a neutral value of -0.0 for FAdd. Reviewed By: dmgreen, spatel Differential Revision: https://reviews.llvm.org/D98963	2021-04-06 12:13:43 +01:00
Simon Pilgrim	ddbb58736a	[KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI. As promised in D98866	2021-04-06 10:11:41 +01:00
Yevgeny Rouban	39e3e3aa51	[NewPM] Redesign of PreserveCFG Checker The reason for the NewPM redesign is described in the commit cba3e783389a: [NewPM] Disable PreservedCFGChecker ... The checker introduces an internal custom CFG analysis that tracks current up-to date CFG snapshot. The analysis is invalidated along any other CFG related analysis (the key is CFGAnalyses). If the CFG analysis is not invalidated at a functional pass exit then the checker asserts that the CFG snapshot taken from this analysis is equals to a snapshot of the current CFG. Along the way: - the function CFG::printDiff() is simplified by removing function name calculation. The name is printed by the caller; - fixed CFG invalidated condition (see CFG::invalidate()); - StandardInstrumentations::registerCallbacks() gets additional optional parameter of type FunctionAnalysisManager*, which is needed by the checker to get the custom CFG analysis; - several PM related tests updated to explicitly set -verify-cfg-preserved=1 as they need. This patch is safe to land as the CFGChecker is left switched off (the options -verify-cfg-preserved is false by default). It will be switched on by a separate patch to minimize possible reverts. Reviewed By: skatkov, kuhar Differential Revision: https://reviews.llvm.org/D91327	2021-04-06 12:35:49 +07:00
Serguei Katkov	0057ec8034	[Statepoint] Factor-out utility function to get non-foldable area of STATEPOINT like instructions. NFC Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D99875	2021-04-06 11:44:37 +07:00
Arthur Eubanks	ea0e2ca1ac	[SROA] Allow SROA on pointers with invariant group intrinsic uses When we are able to SROA an alloca, we know all uses of it, meaning we don't have to preserve the invariant group intrinsics and metadata. It's possible that we could lose information regarding redundant loads/stores, but that's unlikely to have any real impact since right now the only user is Clang and vtables. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99760	2021-04-05 19:53:40 -07:00
Stanislav Mekhanoshin	30b3aab329	Copy syncscope when expanding atomicrmw into cmpxchg loop Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902	2021-04-05 17:29:38 -07:00
Ricky Taylor	4db18d62af	[M68k] Add support for Motorola literal syntax to AsmParser These look like $00A0cf for hex and %001010101 for binary. They are used in Motorola assembly syntax. Differential Revision: https://reviews.llvm.org/D98519	2021-04-05 20:02:29 +01:00
Jennifer Yu	7078ef4722	[OPENMP51]Initial support for nocontext clause. Added basic parsing/sema/serialization support for the 'nocontext' clause. Differential Revision: https://reviews.llvm.org/D99848	2021-04-05 11:45:49 -07:00
Cyndy Ishida	0116d04d04	[TextAPI] move source code files out of subdirectory, NFC TextAPI/ELF has moved out into InterfaceStubs, so theres no longer a need to seperate out TextAPI between formats. Reviewed By: ributzka, int3, #lld-macho Differential Revision: https://reviews.llvm.org/D99811	2021-04-05 10:24:42 -07:00
Alexey Bataev	00a84f9a7f	[SLP]Improve vectorization of the CmpInst instructions. During vectorization better to postpone the vectorization of the CmpInst instructions till the end of the basic block. Otherwise we may vectorize it too early and may miss some vectorization patterns, like reductions. Reworked part of D57059 Differential Revision: https://reviews.llvm.org/D99796	2021-04-05 06:22:51 -07:00
Alex Orlov	5f57793c4f	* NFC. Refactored DIPrinter for better support of new print styles. This patch introduces a DIPrinter interface to implement by different output style printer implementations. DIPrinterGNU and DIPrinterLLVM implement the GNU and LLVM output style printing respectively. No functional changes. This refactoring clarifies and simplifies the code, and makes a new output style addition easier. Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D98994	2021-04-05 15:40:41 +04:00
Nikita Popov	665065821e	[FastISel] Remove kill tracking This is a followup to D98145: As far as I know, tracking of kill flags in FastISel is just a compile-time optimization. However, I'm not actually seeing any compile-time regression when removing the tracking. This probably used to be more important in the past, before FastRA was switched to allocate instructions in reverse order, which means that it discovers kills as a matter of course. As such, the kill tracking doesn't really seem to serve a purpose anymore, and just adds additional complexity and potential for errors. This patch removes it entirely. The primary changes are dropping the hasTrivialKill() method and removing the kill arguments from the emitFast methods. The rest is mechanical fixup. Differential Revision: https://reviews.llvm.org/D98294	2021-04-03 15:50:13 +02:00
Nikita Popov	9d20eaf9c0	[BasicAA] Don't store AATags in cache key (NFC) The AAMDNodes part of the MemoryLocation is not used by the BasicAA cache, so don't store it. This reduces the size of each cache entry from 112 bytes to 48 bytes.	2021-04-03 11:32:01 +02:00
Nikita Popov	17b4e5d456	[BasicAA] Don't pass through AA metadata (NFCI) BasicAA itself doesn't make use of AA metadata, but passes it through to recursive queries and makes it part of the cache key. Aliasing decisions that are based on AA metadata (i.e. TBAA and ScopedAA) are based only on AA metadata, so checking them with different pointer values or sizes is not useful, the result will always be the same. While this change is a mild compile-time improvement by itself, the actual goal here is to reduce the size of AA cache keys in a followup change. Differential Revision: https://reviews.llvm.org/D90098	2021-04-03 11:21:50 +02:00
Simon Pilgrim	4ea5475a3f	[KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI. Include exhaustive test coverage.	2021-04-02 21:44:33 +01:00
Jennifer Yu	cb424fee3d	[OPENMP5.1]Initial support for novariants clause. Added basic parsing/sema/serialization support for the 'novariants' clause.	2021-04-02 13:19:01 -07:00
Jan Kratochvil	942cf22565	[nfc] [llvm] Make DWARFListTableBase::findList const Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99731	2021-04-02 21:41:47 +02:00
Levy Hsu	f78d932cf2	[RISCV] Add IR intrinsics for Zbc extension Head files are included in a separate patch in case the name needs to be changed. RV32 / 64: clmul clmulh clmulr Differential Revision: https://reviews.llvm.org/D99711	2021-04-02 12:09:13 -07:00
Levy Hsu	944adbf285	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	1f0b309f24	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit `1808194590`. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Cyndy Ishida	3a223cd4f3	[TextAPI] run clang-format on violating sections, NFC	2021-04-02 11:44:33 -07:00
Craig Topper	1808194590	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Levy Hsu	b001d574d7	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Brendon Cahoon	09a88278cb	[GlobalISel] Allow different types for G_SBFX and G_UBFX operands Change the definition of G_SBFX and G_UBFX so that the lsb and width can have different types than the src and dst operands. Differential Revision: https://reviews.llvm.org/D99739	2021-04-02 11:11:06 -04:00
Sander de Smalen	0f7bbbc481	Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed. In order to bring up scalable vector support in LLVM incrementally, we introduced behaviour to emit a warning, instead of an error, when asking the wrong question of a scalable vector, like asking for the fixed number of elements. This patch puts that behaviour under a flag. The default behaviour is that the compiler will always error, which means that all LLVM unit tests and regression tests will now fail when a code-path is taken that still uses the wrong interface. The behaviour to demote an error to a warning can be individually enabled for tools that want to support experimental use of scalable vectors. This patch enables that behaviour when driving compilation from Clang. This means that for users who want to try out scalable-vector support, fixed-width codegen support, or build user-code with scalable vector intrinsics, Clang will not crash and burn when the compiler encounters such a case. This allows us to do away with the following pattern in many of the SVE tests: RUN: .... 2>%t RUN: cat %t \| FileCheck --check-prefix=WARN WARN-NOT: warning: ... The behaviour to emit warnings is only temporary and we expect this flag to be removed in the future when scalable vector support is more stable. This patch also has fixes the following tests: unittests: ScalableVectorMVTsTest.SizeQueries SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG regression tests: Transforms/InstCombine/vscale_gep.ll Reviewed By: paulwalker-arm, ctetreau Differential Revision: https://reviews.llvm.org/D98856	2021-04-02 10:55:22 +01:00
Evgeniy Brevnov	2388aae401	[NARY-REASSOCIATE] Support reassociation of min/max Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available. Reviewed By: mkazantsev, lebedev.ri Differential Revision: https://reviews.llvm.org/D88287	2021-04-02 15:30:13 +07:00
Daniel Rodríguez Troitiño	f5c9db97a8	[TextAPI] Add support for arm64_32 Add a new architecture definition for arm64_32. The change should allow the new architecture arm64_32 to be recognized in several pieces of code, TextAPI parsing one of them. llvm-lipo will also recognize the architecture and will allow lipoing files with this architecture without failing. Includes a small test that the architecture is recognized by llvm-nm. Reviewed By: cishida Differential Revision: https://reviews.llvm.org/D99673	2021-04-01 17:19:12 -07:00
cchen	cba422264c	[OpenMP51] Accept `primary` as proc bind affinity policy in Clang Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99622	2021-04-01 18:07:12 -05:00
Philip Reames	ffa15e9463	Extract isVolatile helper on Instruction [NFCI] We have this logic duplicated in several cases, none of which were exhaustive. Consolidate it in one place. I don't believe this actually impacts behavior of the callers. I think they all filter their inputs such that their partial implementations were correct. If not, this might be fixing a cornercase bug.	2021-04-01 11:24:02 -07:00
Philip Reames	6b05d753e0	Mark unordered memset/memmove/memcpy as nosync Mostly a means to remove a bit of code from attributor in advance of implementing a FuncAttr inference for nosync.	2021-04-01 10:38:54 -07:00
Philip Reames	e2c6621e63	[deref-at-point] restrict inference of dereferenceability based on allocsize attribute Support deriving dereferenceability facts from allocation sites with known object sizes while correctly accounting for any possibly frees between allocation and use site. (At the moment, we're conservative and only allowing it in functions where we know we can't free.) This is part of the work on deref-at-point semantics. I'm making the change unconditional as the miscompile in this case is way too easy to trip by accident, and the optimization was only recently added (by me). There will be a follow up patch wiring through TLI since that should now be doable without introducing widespread miscompiles. Differential Revision: https://reviews.llvm.org/D95815	2021-04-01 08:34:40 -07:00
Mircea Trofin	ce61def529	[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs() needs to be called before iterating the interfering live ranges. The rest of the patch offers support that is the case: instead of clearing the query's InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix is invalidated (existing triggering mechanism). Without the change in RegAllocGreedy.cpp, the compiler ices. This patch should make it more easily discoverable by developers that collectInterferringVregs needs to be called before iterating. I will follow up with a subsequent patch to improve the usability and maintainability of Query. Differential Revision: https://reviews.llvm.org/D98232	2021-04-01 08:33:28 -07:00
Anirudh Prasad	7b921a6747	[AsmParser][SystemZ][z/OS] Add in support to accept "#" as part of an Identifier token - This patch adds in support to accept the "#" character as part of an Identifier. - This support is needed especially for the HLASM dialect since "#" is treated as part of the valid "Alphabet" range - The way this is done is by making use of the previous precedent set by the `AllowAtInIdentifier` field in `MCAsmLexer.h`. A new field called `AllowHashInIdentifier` is introduced. - The static function `IsIdentifierChar` is also updated to accept the `#` character if the `AllowHashInIdentifier` field is set to true. Note: The field introduced in `MCAsmLexer.h` could very well be moved to `MCAsmInfo.h`. I'm not opposed to it. I decided to put it in `MCAsmLexer` since there seems to be some sort of precedent already with `AllowAtInIdentifier`. Reviewed By: abhina.sreeskantharajan, nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D99277	2021-04-01 11:24:43 -04:00
Max Kazantsev	630818a850	[NFC] Disambiguate LI in GVN Name GVN uses name 'LI' for two different unrelated things: LoadInst and LoopInfo. This patch relates the variables with former meaning into 'Load' to disambiguate the code.	2021-04-01 12:40:35 +07:00
Chen Zheng	bfcd21876a	[debug-info] support new tuning debugger type DBX for XCOFF DWARF Based on this debugger type, for now, we plan to: 1: use inline string by default for XCOFF DWARF 2: generate no column info for debug line table. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99400	2021-04-01 00:11:30 -04:00
Philip Reames	115a42ad1e	Add debug printers for KnownBits [nfc]	2021-03-31 15:36:07 -07:00
Thomas Lively	45783d0e8a	[WebAssembly] Implement i64x2 comparisons Removes the prototype builtin and intrinsic for i64x2.eq and implements that instruction as well as the other i64x2 comparison instructions in the final SIMD spec. Unsigned comparisons were not included in the final spec, so they still need to be scalarized via a custom lowering. Differential Revision: https://reviews.llvm.org/D99623	2021-03-31 10:46:17 -07:00
Wael Yehia	563cdeaafd	[LTO][Legacy] Decouple option parsing from LTOCodeGenerator in this patch we add a new libLTO API to specify debug options independent of an lto_code_gen_t. This allows clients to pass codegen flags (through libLTO) which otherwise today are ignored. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D92611	2021-03-31 16:43:26 +00:00
Sander de Smalen	b6d0529780	[CostModel] Align the cost model for intrinsics for scalable/fixed-width vectors. Let getIntrinsicInstrCost call getTypeBasedIntrinsicInstrCost for scalable vectors, similar to how this is done for fixed-width vectors, instead of falling back on BaseT::getIntrinsicInstrCost(). If the intrinsic cannot be costed (or is not overloaded by the target), it will return InstructionCost::getInvalid() instead. Depends on D97469 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D97470	2021-03-31 14:52:49 +01:00
Sander de Smalen	2f6f249a49	NFC: Change getIntrinsicInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97468 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D97469	2021-03-31 14:04:41 +01:00
Sander de Smalen	2f56e1c6b1	NFC: Change getTypeBasedIntrinsicCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97466 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D97468	2021-03-31 14:04:41 +01:00
Sander de Smalen	3ccbd4f3c7	NFC: Change getUserCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97382 Reviewed By: ctetreau, paulwalker-arm Differential Revision: https://reviews.llvm.org/D97466	2021-03-31 10:13:09 +01:00
Lang Hames	ec235dd355	[JITLink] Delete copy and move constructors for jitlink::Section. Sections are not movable or copyable.	2021-03-30 22:58:14 -07:00
Lang Hames	0269a407f3	[JITLink] Switch from StringRef to ArrayRef<char>, add some generic x86-64 utils Adds utilities for creating anonymous pointers and jump stubs to x86_64.h. These are used by the GOT and Stubs builder, but may also be used by pass writers who want to create pointer stubs for indirection. This patch also switches the underlying type for LinkGraph content from StringRef to ArrayRef<char>. This avoids any confusion when working with buffers that contain null bytes in the middle like, for example, a newly added null pointer content array. ;)	2021-03-30 21:07:24 -07:00
Chuanqi Xu	eb51dd719f	[Coroutine] [Debug] Insert dbg.declare to entry.resume to print alloca in the coroutine frame under O2 Summary: Try to insert dbg.declare to entry.resume basic block in resume function. In this way, we could print alloca such as __promise in gdb/lldb under O2, which would be beneficial to debug coroutine program. Test Plan: check-llvm Reviewed by: aprantl Differential Revision: https://reviews.llvm.org/D96938	2021-03-31 10:37:06 +08:00
Lang Hames	3a83b8b2d2	[JITLink] Add a setProtectionFlags method to jitlink::Section. This allows clients to modify the memory protection settings on sections via jitlink passes. This can be used to, for example, override the default settings on text pages and make them Read/Write/Executable under the JIT.	2021-03-30 17:56:29 -07:00
Craig Topper	f59ba0849f	[StructLayout] Use TrailingObjects to allocate space for MemberOffsets. MemberOffsets are stored at the end of StructLayout. The class contains a single entry array to mark the start of the member offsets. getStructLayout calculates the additional space needed for additional elements before allocating memory. This patch converts this to use TrailingObjects. This simplifies the size computation in getStructLayout and gets rid of the single entry array. This is prep work, but to use TypeSize instead of uint64_t for D98169. The single entry array doesn't work with TypeSize because TypeSize doesn't have a default constructor. We thought this change was an improvement by itself so we've separated it out. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D99608	2021-03-30 17:36:50 -07:00
Wei Mi	d535a05ca1	[ThinLTO] During module importing, close one source module before open another one for distributed mode. Currently during module importing, ThinLTO opens all the source modules, collect functions to be imported and append them to the destination module, then leave all the modules open through out the lto backend pipeline. This patch refactors it in the way that one source module will be closed before another source module is opened. All the source modules will be closed after importing phase is done. It will save some amount of memory when there are many source modules to be imported. Note that this patch only changes the distributed thinlto mode. For in process thinlto mode, one source module is shared acorss different thinlto backend threads so it is not changed in this patch. Differential Revision: https://reviews.llvm.org/D99554	2021-03-30 14:37:29 -07:00
Mike Rice	b7899ba0e8	[OPENMP51]Initial support for the dispatch directive. Added basic parsing/sema/serialization support for dispatch directive. Differential Revision: https://reviews.llvm.org/D99537	2021-03-30 14:12:53 -07:00
Amara Emerson	a35c2c7942	[GlobalISel] Implement fewerElements legalization for vector reductions. This patch adds 3 methods, one for power-of-2 vectors which use tree reductions using vector ops, before a final reduction op. For non-pow-2 types it generates multiple narrow reductions and combines the values with scalar ops. Differential Revision: https://reviews.llvm.org/D97163	2021-03-30 11:19:21 -07:00
Amara Emerson	91887cd4ec	[AArch64][GlobalISel] Combine funnel shifts to rotates. Differential Revision: https://reviews.llvm.org/D99388	2021-03-30 11:00:36 -07:00
Sourabh Singh Tomar	f13f050551	[DebugInfo] Support for signed constants inside DIExpression Negative numbers are represented using DW_OP_consts along with signed representation of the number as the argument. Test case IR is generated using Fortran front-end. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99273	2021-03-30 23:20:38 +05:30
Hongtao Yu	3e3fc431df	[CSSPGO] Top-down processing order based on full profile. Use profiled call edges to augment the top-down order. There are cases that the top-down order computed based on the static call graph doesn't reflect real execution order. For example: 1. Incomplete static call graph due to unknown indirect call targets. Adjusting the order by considering indirect call edges from the profile can enable the inlining of indirect call targets by allowing the caller processed before them. 2. Mutual call edges in an SCC. The static processing order computed for an SCC may not reflect the call contexts in the context-sensitive profile, thus may cause potential inlining to be overlooked. The function order in one SCC is being adjusted to a top-down order based on the profile to favor more inlining. 3. Transitive indirect call edges due to inlining. When a callee function is inlined into into a caller function in LTO prelink, every call edge originated from the callee will be transferred to the caller. If any of the transferred edges is indirect, the original profiled indirect edge, even if considered, would not enforce a top-down order from the caller to the potential indirect call target in LTO postlink since the inlined callee is gone from the static call graph. 4. #3 can happen even for direct call targets, due to functions defined in header files. Header functions, when included into source files, are defined multiple times but only one definition survives due to ODR. Therefore, the LTO prelink inlining done on those dropped definitions can be useless based on a local file scope. More importantly, the inlinee, once fully inlined to a to-be-dropped inliner, will have no profile to consume when its outlined version is compiled. This can lead to a profile-less prelink compilation for the outlined version of the inlinee function which may be called from external modules. while this isn't easy to fix, we rely on the postlink AutoFDO pipeline to optimize the inlinee. Since the survived copy of the inliner (defined in headers) can be inlined in its local scope in prelink, it may not exist in the merged IR in postlink, and we'll need the profiled call edges to enforce a top-down order for the rest of the functions. Considering those cases, a profiled call graph completely independent of the static call graph is constructed based on profile data, where function objects are not even needed to handle case #3 and case 4. I'm seeing an average 0.4% perf win out of SPEC2017. For certain benchmark such as Xalanbmk and GCC, the win is bigger, above 2%. The change is an enhancement to https://reviews.llvm.org/D95988. Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D99351	2021-03-30 10:42:22 -07:00
Jessica Paquette	700431128e	[GlobalISel][AArch64] Combine G_SEXT_INREG + right shift -> G_SBFX Basically a port of isBitfieldExtractOpFromSExtInReg in AArch64ISelDAGToDAG. This is only done post-legalization for now. Once the legalizer knows how to decompose these back into shifts, this requirement can probably be removed. Differential Revision: https://reviews.llvm.org/D99230	2021-03-30 10:14:30 -07:00
Amara Emerson	f5e9be6fdb	[GlobalISel] Implement lowering for G_ROTR and G_ROTL. This is a straightforward port. Differential Revision: https://reviews.llvm.org/D99449	2021-03-30 09:44:41 -07:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to it's name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Craig Topper	292816d2b6	[RISCV] Don't set the SplatOperand flag on intrinsics that take a shift amount. The shift amount should always be a vector or an XLen scalar. The SplatOperand flag is used to indicate we need to legalize non-XLen scalars including special handling for i64 on RV32. This will prevent us from silently adjusting these operands if the intrinsics are misused. I'll probably adjust the name of the SplatOperand flag slightly in a follow up patch. Reviewed By: khchen, frasercrmck Differential Revision: https://reviews.llvm.org/D99545	2021-03-30 09:23:36 -07:00
Krasimir Georgiev	c51e91e046	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `5178ffc7cf`. Compiling `llvm-profdata` with a compiler build from this produces a crashing binary.	2021-03-30 14:13:37 +02:00
Sander de Smalen	f71ed5dfe2	NFC: Migrate PartialInlining to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D97382	2021-03-30 11:59:45 +01:00
Sander de Smalen	4ca860742d	[InstructionCost] Don't conflate Invalid costs with Unknown costs. We previously made a change to getUserCost to return a Invalid cost when one of the TTI costs returned '-1' (meaning 'unknown' or 'infinitely expensive'). It makes no sense to say that: shufflevector <2 x i8> %x, <2 x i8> %y, <4 x i32> <i32 0, i32 1, i32 2, i32 3> has an invalid cost. Perhaps the cost is not known, but the IR is valid and can be code-generated. Invalid should only be used for IR that cannot possibly be code-generated and where a cost is nonsensical. With more passes now asserting that the cost must be valid, it is possible that those assertions will fail for perfectly valid IR. An incomplete cost-model probably shouldn't be a reason for the compiler to break. It's better to consider these costs as 'very expensive' and ignore them for other reasons. At some point, we should consider replacing -1 with some other mechanism. Reviewed By: paulwalker-arm, dmgreen Differential Revision: https://reviews.llvm.org/D99502	2021-03-30 09:29:42 +01:00
Stefan Gränitz	c352a2b829	[lli] Add option -lljit-platform=Inactive to disable platform support explicitly This option tells LLJIT to disable platform support explicitly: JITDylibs aren't scanned for special init/deinit symbols and no runtime API interposes are injected. It's useful in two cases: for platforms that don't have such requirements and platforms for which we have no explicit support yet and that don't work well with the generic IR platform. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D99416	2021-03-30 09:29:45 +02:00
Alok Kumar Sharma	9fb0025f70	[DebugInfo] Upgrade DISubragne::count to accept DIExpression also This is needed for Fortran assumed shape arrays whose dimensions are defined as, - 'count' is taken from array descriptor passed as parameter by caller, access from descriptor is defined by type DIExpression. - 'lowerBound' is defined by callee. The current alternate way represents using upperBound in place of count, where upperBound is calculated in callee in a temp variable using lowerBound and count Representation with count (DIExpression) is not only clearer as compared to upperBound (DIVariable) but it has another advantage that variable count is accessed by being parameter has better chance of survival at higher optimization level than upperBound being local variable. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99335	2021-03-30 09:16:55 +05:30
Adrian Prantl	8573c28a51	Add debug support for set types This commit adds debugging support for set types defined in languages such as Pascal and Modula-2. Patch by Peter McKinna! Differential Revision: https://reviews.llvm.org/D76115	2021-03-29 18:04:48 -07:00
Huihui Zhang	ca721042f1	[IPO][SampleContextTracker] Use SmallVector to track context profiles to prevent non-determinism. Use SmallVector instead of SmallSet to track the context profiles mapped. Doing this can help avoid non-determinism caused by iterating over unordered containers. This bug was found with reverse iteration turning on, --extra-llvm-cmake-variables="-DLLVM_REVERSE_ITERATION=ON". Failing LLVM test profile-context-tracker-debug.ll . Reviewed By: MaskRay, wenlei Differential Revision: https://reviews.llvm.org/D99547	2021-03-29 16:37:10 -07:00
Gulfem Savrun Yeniceri	5178ffc7cf	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-29 21:53:32 +00:00
Nico Weber	f53dc06ed3	fix comment typo to cycle bots	2021-03-29 15:47:16 -04:00
Wenlei He	30b0232336	[CSSPGO][llvm-profgen] Context-sensitive global pre-inliner This change sets up a framework in llvm-profgen to estimate inline decision and adjust context-sensitive profile based on that. We call it a global pre-inliner in llvm-profgen. It will serve two purposes: 1) Since context profile for not inlined context will be merged into base profile, if we estimate a context will not be inlined, we can merge the context profile in the output to save profile size. 2) For thinLTO, when a context involving functions from different modules is not inined, we can't merge functions profiles across modules, leading to suboptimal post-inline count quality. By estimating some inline decisions, we would be able to adjust/merge context profiles beforehand as a mitigation. Compiler inline heuristic uses inline cost which is not available in llvm-profgen. But since inline cost is closely related to size, we could get an estimate through function size from debug info. Because the size we have in llvm-profgen is the final size, it could also be more accurate than the inline cost estimation in the compiler. This change only has the framework, with a few TODOs left for follow up patches for a complete implementation: 1) We need to retrieve size for funciton//inlinee from debug info for inlining estimation. Currently we use number of samples in a profile as place holder for size estimation. 2) Currently the thresholds are using the values used by sample loader inliner. But they need to be tuned since the size here is fully optimized machine code size, instead of inline cost based on not yet fully optimized IR. Differential Revision: https://reviews.llvm.org/D99146	2021-03-29 09:46:14 -07:00
Wei Mi	3cbf44190b	[SampleFDO] Do not scale the magic number NOMORE_ICP_MAGICNUM in value profile during profile update. When we inline a function and update the profile, the value profiles of the indirect call in the inliner and inlinee will be scaled. In https://reviews.llvm.org/D96806 and https://reviews.llvm.org/D97350, we start using the magic number NOMORE_ICP_MAGICNUM (-1) to mark targets which have been promoted. The magic number shouldn't be scaled during the profile update. Although the problem has been suppressed by https://reviews.llvm.org/D98187 for SampleFDO, which stops profile update for inlining in sampleFDO, the patch is still wanted since it will be more consistent to handle the magic number properly in profile update. Differential Revision: https://reviews.llvm.org/D99394	2021-03-29 09:34:37 -07:00
Florian Hahn	c773d0f973	Recommit "[LV] Move runtime pointer size check to LVP::plan()." Re-apply `25fbe803d4`, with a small update to emit the right remark class. Original message: [LV] Move runtime pointer size check to LVP::plan(). This removes the need for the remaining doesNotMeet check and instead directly checks if there are too many runtime checks for vectorization in the planner. A subsequent patch will adjust the logic used to decide whether to vectorize with runtime to consider their cost more accurately. Reviewed By: lebedev.ri	2021-03-29 16:14:27 +01:00
Bradley Smith	9745dce8c3	[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps This is currently performed in SelectionDAGLegalize, here we make it also happen in LegalizeVectorOps, allowing a target to lower the SETCC condition codes first in LegalizeVectorOps and then lower to a custom node afterwards, without having to duplicate all of the SETCC condition legalization in the target specific lowering. As a result of this, fixed length floating point SETCC nodes can now be properly lowered for SVE. Differential Revision: https://reviews.llvm.org/D98939	2021-03-29 15:32:25 +01:00
Florian Hahn	485c8ce733	Revert "[LV] Move runtime pointer size check to LVP::plan()." This reverts commit `25fbe803d4`. This breaks a clang test which filters for the wrong remark type.	2021-03-29 14:41:53 +01:00
Paul C. Anagnostopoulos	5f473a04af	[TableGen] Add support for the 'assert' statement in class definitions. Differential Revision: https://reviews.llvm.org/D99275	2021-03-29 09:20:29 -04:00
Florian Hahn	25fbe803d4	[LV] Move runtime pointer size check to LVP::plan(). This removes the need for the remaining doesNotMeet check and instead directly checks if there are too many runtime checks for vectorization in the planner. A subsequent patch will adjust the logic used to decide whether to vectorize with runtime to consider their cost more accurately. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98634	2021-03-29 14:12:29 +01:00
Matt Arsenault	9a0c9402fa	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `07e46367ba`.	2021-03-29 08:55:30 -04:00
Jingu Kang	e4abb64100	[LoopUnswitch] Use reference variables instead of pointer one Differential Revision: https://reviews.llvm.org/D99496	2021-03-29 13:08:46 +01:00
Oliver Stannard	07e46367ba	Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute"" Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit `fc9df30991`.	2021-03-29 11:32:22 +01:00
Jingu Kang	cfe87d4edd	[NFC][LoopUnswitch] Move hasPartialIVCondition to LoopUtils Differential revision: https://reviews.llvm.org/D99490	2021-03-29 10:29:45 +01:00
Lang Hames	666df2e2cb	[ORC][C-bindings] Fix some ORC C bindings function names and signatures. LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not have matching signatures between the C-API header and binding implementations. Fixes http://llvm.org/PR49745. Patch by Mats Larsen. Thanks Mats! Reviewed by: lhames Differential Revision: https://reviews.llvm.org/D99478	2021-03-28 16:30:47 -07:00
Matt Arsenault	fc9df30991	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `20d5c42e0e`.	2021-03-28 13:35:21 -04:00
Nico Weber	20d5c42e0e	Revert "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `4fefed6563`. Broke check-clang everywhere.	2021-03-28 13:02:52 -04:00
Matt Arsenault	4fefed6563	OpaquePtr: Turn inalloca into a type attribute I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.	2021-03-28 11:12:23 -04:00
Florian Hahn	eb3d9f2eb6	[SelDag] Add isIntOrFPConstant helper function. This patch adds a new isIntOrFPConstant helper function to check if a SDValue is a integer of FP constant. This pattern is used in various places. There also are places that incorrectly just check for integer constants, e.g. D99384, so hopefully this helper will help people avoid that issue. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D99428	2021-03-28 12:48:58 +01:00
Nikita Popov	9075864b73	[BasicAA] Refactor linear expression decomposition The current linear expression decomposition handles zext/sext by decomposing the casted operand, and then checking NUW/NSW flags to determine whether the extension can be distributed. This has some disadvantages: First, it is not possible to perform a partial decomposition. If we have zext((x + C1) +<nuw> C2) then we will fail to decompose the expression entirely, even though it would be safe and profitable to decompose it to zext(x + C1) +<nuw> zext(C2) Second, we may end up performing unnecessary decompositions, which will later be discarded because they lack nowrap flags necessary for extensions. Third, correctness of the code is not entirely obvious: At a high level, we encounter zext(x -<nuw> C) in the form of a zext on the linear expression x + (-C) with nuw flag set. Notably, this case must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C). The code handles this correctly by speculatively zexting constants to the final bitwidth, and performing additional fixup if the actual extension turns out to be an sext. This was not immediately obvious to me. This patch inverts the approach: An ExtendedValue represents a zext(sext(V)), and linear expression decomposition will try to decompose V further, either by absorbing another sext/zext into the ExtendedValue, or by distributing zext(sext(x op C)) over a binary operator with appropriate nsw/nuw flags. At each step we can determine whether distribution is legal and abort with a partial decomposition if not. We also know which extensions we need to apply to constants, and don't need to speculate or fixup.	2021-03-27 23:31:58 +01:00
Juneyoung Lee	05884d3b52	Make FoldBranchToCommonDest poison-safe by default This is a small patch to make FoldBranchToCommonDest poison-safe by default. After `fc3f0c9c`, only two syntactic changes are needed to fix unit tests. This does not cause any assembly difference in testsuite as well (-O3, X86-64 Manjaro). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99452	2021-03-27 19:05:12 +09:00
Amara Emerson	55533203d7	[GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates. Differential Revision: https://reviews.llvm.org/D99383	2021-03-25 17:23:30 -07:00
Jessica Paquette	23f657c165	[AArch64][GlobalISel] Emit bzero on Darwin Darwin platforms for both AArch64 and X86 can provide optimized `bzero()` routines. In this case, it may be preferable to use `bzero` in place of a memset of 0. This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can be generated by platforms which may want to use bzero. To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The conditions for this are largely a port of the bzero case in `AArch64SelectionDAGInfo::EmitTargetCodeForMemset`. The only difference in comparison to the SelectionDAG code is that, when compiling for minsize, this will fire for all memsets of 0. The original code notes that it's not beneficial to do this for small memsets; however, using bzero here will save a mov from wzr. For minsize, I think that it's preferable to prioritise omitting the mov. This also fixes a bug in the libcall legalization code which would delete instructions which could not be legalized. It also adds a check to make sure that we actually get a libcall name. Code size improvements (Darwin): - CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign) - CTMark -Oz: -0.2% geomean (-0.5% on bullet) Differential Revision: https://reviews.llvm.org/D99358	2021-03-25 17:14:25 -07:00
Fangrui Song	ed956554f9	[Triple][Driver] Add muslx32 environment and use /lib/ld-musl-x32.so.1 for -dynamic-linker Differential Revision: https://reviews.llvm.org/D99308	2021-03-25 16:25:47 -07:00
Guozhi Wei	3240910f00	[DAE] Adjust param/arg attributes when changing parameter to undef In DeadArgumentElimination pass, if a function's argument is never used, corresponding caller's parameter can be changed to undef. If the param/arg has attribute noundef or other related attributes, LLVM LangRef(https://llvm.org/docs/LangRef.html#parameter-attributes) says its behavior is undefined. SimplifyCFG(D97244) takes advantage of this behavior and does bad transformation on valid code. To avoid this undefined behavior when change caller's parameter to undef, this patch removes noundef attribute and other attributes imply noundef on param/arg. Differential Revision: https://reviews.llvm.org/D98899	2021-03-25 14:53:22 -07:00
Philip Reames	4f5e92cc05	Mark gc.relocate and gc.result as readnone (try 2) As noted in the LangRef, these are semantically readnone projections from the result value of the associated statepoint. However, it turned out we had a few latent bugs being covered up by the fact we were only marking them readonly (see PR49607 for context). As of this change, all known issues are resolved. This is a deliberately minimal patch to make it easy to test downstream and revert with minimal change if that turns out to be necessary. Differential Revision: https://reviews.llvm.org/D98729	2021-03-25 14:50:07 -07:00
Andrew Savonichev	bba25a9cd8	[MCA] Support carry-over instructions for in-order processors Instructions that have more uops than the processor's IssueWidth are issued in multiple cycles. The patch fixes PR49712. Differential Revision: https://reviews.llvm.org/D99339	2021-03-26 00:06:19 +03:00
Nikita Popov	93a636d9f6	[IR] Lift attribute handling for assume bundles into CallBase Rather than special-casing assume in BasicAA getModRefBehavior(), do this one level higher, in the attribute handling of CallBase. For assumes with operand bundles, the inaccessiblememonly attribute applies regardless of operand bundles.	2021-03-25 21:15:39 +01:00
Mircea Trofin	20ad206b60	[NFC] Module::getInstructionCount() is const	2021-03-25 12:29:19 -07:00
Stanislav Mekhanoshin	dc928e9c37	[AMDGPU] Refactoring mfma intrinsic definitions. NFC. Differential Revision: https://reviews.llvm.org/D99366	2021-03-25 12:22:52 -07:00
Jamie Schmeiser	7f2ae3d55f	add print-change diff modes that do not use colour Summary: The colour characters currently added to the output of -print-changed=diff and -print-changed=diff-quiet cause difficulties when capturing the output and examining it in an editor. Change the function to not have the colour characters and add 2 new choices (-print-changed=cdiff and -print-changed=cdiff-quiet) to retain the existing functionality of adding the colour characters. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban) Differential Revision: https://reviews.llvm.org/D97398	2021-03-25 10:35:27 -04:00
Abhina Sreeskantharajan	c83cd8feef	[NFC] Reordering parameters in getFile and getFileOrSTDIN In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used. ``` static ErrorOr<std::unique_ptr<MemoryBuffer>> getFile(const Twine &Filename, bool IsText = false, bool RequiresNullTerminator = true, bool IsVolatile = false); static ErrorOr<std::unique_ptr<MemoryBuffer>> getFileOrSTDIN(const Twine &Filename, bool IsText = false, bool RequiresNullTerminator = true); static ErrorOr<std::unique_ptr<MB>> getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset, bool IsText, bool RequiresNullTerminator, bool IsVolatile); static ErrorOr<std::unique_ptr<WritableMemoryBuffer>> getFile(const Twine &Filename, bool IsVolatile = false); ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D99182	2021-03-25 09:47:49 -04:00
Philip Reames	9a82f42d12	Plumb TLI through isSafeToExecuteUnconditionally [NFC] Split from D95815 to reduce patch size. Isn't (yet) used for anything, only the client side is wired up.	2021-03-24 17:52:04 -07:00
Sanjay Patel	adf42dff42	[ValueTracking] peek through min/max to find isKnownToBeAPowerOfTwo This is similar to the select logic just ahead of the new code. Min/max choose exactly one value from the inputs, so if both of those are a power-of-2, then the result must be a power-of-2. This might help with D98152, but we likely still need other pieces of the puzzle to avoid regressions. The change in PatternMatch.h is needed to build with clang. It's possible there is a better way to deal with the 'const' incompatibities. Differential Revision: https://reviews.llvm.org/D99276	2021-03-24 17:54:38 -04:00
Gulfem Savrun Yeniceri	5fbe1fdf17	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `5fd001a5ff` because it broke clang-with-thin-lto-ubuntu bot.	2021-03-24 18:59:33 +00:00
Gulfem Savrun Yeniceri	5fd001a5ff	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-24 17:31:18 +00:00
Thomas Preud'homme	3b52c04e82	Make FindAvailableLoadedValue TBAA aware FindAvailableLoadedValue() relies on FindAvailablePtrLoadStore() to run the alias analysis when searching for an equivalent value. However, FindAvailablePtrLoadStore() calls the alias analysis framework with a memory location for the load constructed from an address and a size, which thus lacks TBAA metadata info. This commit modifies FindAvailablePtrLoadStore() to accept an optional memory location as parameter to allow FindAvailableLoadedValue() to create it based on the load instruction, which would then have TBAA metadata info attached. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99206	2021-03-24 17:20:26 +00:00
Konstantin Zhuravlyov	f4ace63737	AMDGPU: Add target id and code object v4 support - Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object) - Add kernarg_size to kernel descriptor - Change trap handler ABI to no longer move queue pointer into s[0:1] - Cleanup ELF definitions - Add V2, V3, V4 suffixes to make a clear distinction for code object version - Consolidate note names Differential Revision: https://reviews.llvm.org/D95638	2021-03-24 11:54:05 -04:00
Sander de Smalen	55d18b3cc2	[TTI] Return a TypeSize from getRegisterBitWidth. This patch changes the interface to take a RegisterKind, to indicate whether the register bitwidth of a scalar register, fixed-width vector register, or scalable vector register must be returned. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D98874	2021-03-24 14:45:13 +00:00
Anirudh Prasad	301d9261b7	[AsmParser][SystemZ][z/OS] Re-introduce HLASM comment syntax - https://reviews.llvm.org/rGb605cfb336989705f391d255b7628062d3dfe9c3 was reverted due to sanitizer bugs in the introduced unit-test (specifically in the Address sanitizer https://lab.llvm.org/buildbot/#/builders/5/builds/5697) - This patch attempts to rectify that, as well as re-factor parts of the test - The issue was previously, within the `setupCallToAsmParser` function in the unit-test, `SrcMgr` was declared as a local variable. `SrcMgr` owns a unique pointer. Since the variable goes out of scope at the end of the function, the unique pointer is released. - This patch, moves the declaration of the `SrcMgr` variable to a class field, since the scope will remain until the class's destructor is invoked (which in this case is at the end of the unit test) - Furthermore, this patch also moves the `MCContext Ctx` declaration from a local variable instance inside a function, to a unique pointer class field. This ensures the instantiation of the MCContext remains until the tear down of the test. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D99004	2021-03-24 10:17:00 -04:00
Joseph Huber	8140d0ec4a	[OpenMP] Change OMPIRBuilder to append function attributes Summary: Currently the OMPIRBuilder overwrites the function's existing attributes when it assigns the ones defined in OMPKinds.def. This changes the behaviour to append the current function's attributes with them instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D98740	2021-03-24 09:08:29 -04:00
Andrea Di Biagio	97a00b7b20	[MCA] Fix for uninitialised member in constructor. NFC	2021-03-24 11:21:59 +00:00
Florian Hahn	cd0c00c9fe	[LV] Move exact FP math check out of Requirements. We know if the loop contains FP instructions preventing vectorization after we are done with legality checks. This patch updates the code the check for un-vectorizable FP operations earlier, to avoid unnecessarily running the cost model and picking a vectorization factor. It also makes the code more direct and moves the check to a position where similar checks are done. I might be missing something, but I don't see any reason to handle this check differently to other, similar checks. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98633	2021-03-24 11:01:44 +00:00
Andrew Savonichev	292da93d59	[MCA] Disable RCU for InOrderIssueStage This is a follow-up for: D98604 [MCA] Ensure that writes occur in-order When instructions are aligned by the order of writes, they retire in-order naturally. There is no need for an RCU, so it is disabled. Differential Revision: https://reviews.llvm.org/D98628	2021-03-24 13:54:04 +03:00
Andy Wingo	c9801db2eb	[WebAssembly][MC] Record limit constraints for table sizes This commit adds a full WasmTableType to MCSymbolWasm, differing from the current situation (just an ElemType) in that it additionally records a WasmLimits. We add support for specifying the limits in .S files also, via the following syntax variations: .tabletype SYM, ELEMTYPE .tabletype SYM, ELEMTYPE, MINSIZE .tabletype SYM, ELEMTYPE, MINSIZE, MAXSIZE Depends on D99186. Differential Revision: https://reviews.llvm.org/D99191	2021-03-24 09:44:22 +01:00
Andy Wingo	9ac5620cb8	[WebAssembly] Rename WasmLimits::Initial to ::Minimum. NFC. This patch renames the "Initial" member of WasmLimits to the name used in the spec, "Minimum". In the core WebAssembly specification, the Limits data type has one required "min" member and one optional "max" member, indicating the minimum required size of the corresponding table or memory, and the maximum size, if any. Although the WebAssembly spec does instantiate locally-defined tables and memories with the initial size being equal to the minimum size, it can't impose such a requirement for imports. It doesn't make sense to require an initial size for a memory import, for example. The compiler can only sensibly express the minimum and maximum sizes. See https://github.com/WebAssembly/js-types/blob/master/proposals/js-types/Overview.md#naming-of-size-limits for a related discussion that agrees that the right name of "initial" is "minimum" when querying the type of a table or memory from JavaScript. (Of course it still makes sense for JS to speak in terms of an initial size when it explicitly instantiates memories and tables.) Differential Revision: https://reviews.llvm.org/D99186	2021-03-24 09:10:11 +01:00
Alex Orlov	876435c487	* Fix demangling of optional template-args for vendor extended type qualifier. This fixes https://bugs.llvm.org/show_bug.cgi?id=48009 bug. Reviewed By: erik.pilkington, krisb Differential Revision: https://reviews.llvm.org/D98687	2021-03-24 10:21:32 +04:00
Chuanqi Xu	3b83590cb2	[NFC] [Support] Fix unconsistent comment with codes for ExtendSigned	2021-03-24 13:58:54 +08:00
Max Kazantsev	85cbfe75af	[NFC] Fix comment describing what EdgeBundles is The original comment says the same thing twice, and does not mention that edges entering the block are also in the same bundle (which seems true from what the underlying code is doing). Differential Revision: https://reviews.llvm.org/D99144 Reviewed By: RKSimon	2021-03-24 11:04:05 +07:00
Serguei Katkov	311d81ce97	[RegAlloc] Fix "ran out of regs" with uses in statepoint Statepoint instruction is known to have a variable and big number of operands. It is possible that Register Allocator will split live intervals in the way that all physical registers are occupied by "zero-length" live intervals which are marked as not-spillable. While intervals are marked as not-spillable in the moment of creation when they are really zero-length it is possible that in future as part of re-materialization there will need for physical register between def and use of such tiny interval (the use is not related to this interval at all). As all physical registers are assigned to not-spillable intervals there is not avaialbe registers and RA reports an error. The idea of the fix is avoid marking tiny live intervals where there is a use in statepoint instruction in var args section. Such interval may be perfectly spilled and folded to operand of statepoint. Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98766	2021-03-24 10:25:34 +07:00
Choongwoo Han	772e1dd1dd	[Coverage] Load records immediately The current implementation keeps buffers generated for each object file until it completes loading of all files. This approach requires a lot of memory if there are a lot of huge object files. Thus, make it to load coverage records immediately rather than waiting for other binaries to be loaded. This reduces memory usage of llvm-cov from >128GB to 5GB when loading Chromium binaries in Windows. Additional testing: check-profile, check-llvm Differential Revision: https://reviews.llvm.org/D99110	2021-03-23 16:25:20 -07:00
Rafael Auler	53196387c2	Add register size info back to MCRegisterClass This patch addresses the removal of register size information done in commit `c8b782c`. Without this change, there is no viable option to get register size information outside libTarget. We need this information to run analysis that know the register size from the MC layer, used by BOLT. Discussion D50285 and D47199. Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D97891	2021-03-23 15:04:44 -07:00
Alexey Bataev	99203f2004	[Analysis]Add getPointersDiff function to improve compile time. Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967	2021-03-23 14:25:36 -07:00
Alexey Bataev	f1b47ad278	Revert "[Analysis]Add getPointersDiff function to improve compile time." This reverts commit `065a14a12d` to investigate and fix crash in SLP vectorizer.	2021-03-23 13:17:54 -07:00
Alexey Bataev	065a14a12d	[Analysis]Add getPointersDiff function to improve compile time. Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967	2021-03-23 12:58:42 -07:00
Tony	c181724a9b	[NFC][AMDGPU] Reserve AMD GPU ELF machine number 0x41 Reviewed By: foad Differential Revision: https://reviews.llvm.org/D99196	2021-03-23 17:53:02 +00:00
Nathan James	a0f48d57a9	[NFC] Enable RVALUE_REFERENCE_THIS on MSVC 2019 In https://reviews.llvm.org/D72948 This was enabled for all MSVC but reverted as it was determined not to work on some 2017 versions. The issue is assumed to be fixed on 2019 so enable for 2019 and newer. Some testing could be done to determine which version of MSVC 2017 support this feature but its safer right now to leave it at 2019. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D98809	2021-03-23 16:40:13 +00:00
Andrea Di Biagio	f5bdc88e4d	[MCA] Improved handling of negative read-advance cycles. Before this patch, register writes were always invalidated by the RegisterFile at instruction commit stage. So, the RegisterFile was often losing the knowledge about the `execute cycle` of writes already committed. While this was not problematic for non-delayed reads, this was sometimes leading to inaccurate read latency computations in the presence of negative read-advance cycles. This patch fixes the issue by changing how the RegisterFile component internally keeps track of the `execute cycle` information of each write. On every instruction executed, the RegisterFile gets notified by the RetireStage, so that it can internally record the execute cycle of each executed write. The `execute cycle` information is stored within WriteRef itself, and it is not invalidated when the write is committed.	2021-03-23 14:47:23 +00:00
Jamie Schmeiser	64336d3421	Revert "A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash." This reverts commit `9544a32287`.	2021-03-23 10:09:27 -04:00
Jamie Schmeiser	9544a32287	A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash. Summary: The IR is saved in its print form before each pass is started and a signal handler is registered. If the compilation crashes, the signal handler will print the saved IR to dbgs(). This option can be modified using -print-module-scope to get the IR for the complete module. Note that this option only works with the new pass manager. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban) Differential Revision: https://reviews.llvm.org/D86657	2021-03-23 09:29:17 -04:00
serge-sans-paille	e19884cd74	Introduce a generic operator to apply complex operations to BitVector This avoids temporary and memcpy call when computing large expressions. It's basically some kind of poor man's expression template, but it seems easier to maintain to have a single generic `apply` call instead of the whole expression template machinery here. Differential Revision: https://reviews.llvm.org/D98176	2021-03-23 14:23:26 +01:00
Yvan Roux	241032a205	[llvm-symbolizer][llvm-nm] Fix AArch64 and ARM mapping symbols handling. Exclude AArch64 mapping symbols ($x and $d) for symtab symbolization as it was done for ARM since D95916 tom bring bots back to green state. This is implemented by setting SF_FormatSpecific such that llvm-symbolizer will ignore them, and use this flag to re-implement llvm-nm --special-syms option which make it work for both targets. Differential Revision: https://reviews.llvm.org/D98803	2021-03-23 14:17:12 +01:00
Valentin Clement	d709dcc090	[openacc][openmp] Reduce number of generated file and prefer inclusion of .inc Follow up from D92955 and D83636. This patch makes the base cpp files OMP.cpp and ACC.cpp normal files and they now include the XXX.inc file generated by tablegen. This reduces the number of file generated by the DirectiveEmitter backend and makes it closer to the proposal in D83636. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D93560	2021-03-23 09:16:53 -04:00
Matt Arsenault	b24436ac96	GlobalISel: Lower funnel shifts	2021-03-23 09:11:17 -04:00
David Sherwood	748ae5281d	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Pushpinder Singh	d0e5422eb8	[GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO Reviewed By: foad Differential Revision: https://reviews.llvm.org/D93963	2021-03-23 05:45:43 +00:00
Rahman Lavaee	949abf7d6a	[llvm-readelf, propeller] Add fallthrough bit to basic block metadata in BB-Address-Map section. This patch adds a fallthrough bit to basic block metadata, indicating whether the basic block can fallthrough without taking any branches. The bit will help us avoid an intel LBR bug which results in occasional duplicate entries at the beginning of the LBR stack. This patch uses `MachineBasicBlock::canFallThrough()` to set the bit. This is not a const method because it eventually calls `TargetInstrInfo::analyzeBranch`, but it calls this function with the default `AllowModify=false`. So we can either make the argument to the `getBBAddrMapMetadata` non-const, or we can use `const_cast` when calling `canFallThrough`. I decide to go with the latter since this is purely due to legacy code, and in general we should not allow the BasicBlock to be mutable during `getBBAddrMapMetadata`. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D96918	2021-03-22 21:38:05 -07:00
Tony	1e04706adb	[AMDGPU] Reserve ELF code Reserve AMD GPU ELF machine code 0x040. Minor AMDGPUUsage format consistency change. Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D99122	2021-03-23 04:30:38 +00:00
Gulfem Savrun Yeniceri	e3a6d70c68	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `78a65cd945` which caused buildbot failures.	2021-03-23 00:43:16 +00:00
Juneyoung Lee	5c2e50b5d2	Reland "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe" This relands commit `99108c791d` (D95026) which was reverted by `8d5a981a13` because the underlying problem (https://llvm.org/pr49495) is fixed.	2021-03-23 09:19:53 +09:00
Gulfem Savrun Yeniceri	78a65cd945	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-22 22:09:02 +00:00
Roman Lebedev	d37fe26a2b	[NFC][IR] Type: add getWithNewType() method Sometimes you want to get a type with same vector element count as the current type, but different element type, but there's no QOL wrapper to do that. Add one.	2021-03-23 00:50:58 +03:00
Nikita Popov	7e18cd887c	[InstCombine] Whitelist non-refining folds in SimplifyWithOpReplaced This is an alternative to D98391/D98585, playing things more conservatively. If AllowRefinement == false, then we don't use InstSimplify methods at all, and instead explicitly implement a small number of non-refining folds. Most cases are handled by constant folding, and I only had to add three folds to cover our unit tests / test-suite. While this may lose some optimization power, I think it is safer to approach from this direction, given how many issues this code has already caused. Differential Revision: https://reviews.llvm.org/D99027	2021-03-22 22:12:56 +01:00
Nikita Popov	ca28e32359	[IR] Mark assume/annotation as InaccessibleMemOnly These intrinsics don't need to be marked as arbitrary writing, it's sufficient to write inaccessible memory (aka "side effect") to preserve control dependencies. This means less special-casing in BasicAA. This is intended as an alternative to D98925. Differential Revision: https://reviews.llvm.org/D99022	2021-03-22 22:01:03 +01:00
Sanjay Patel	664d0c052c	[TargetTransformInfo] move branch probability query from TargetLoweringInfo This is no-functional-change intended (NFC), but needed to allow optimizer passes to use the API. See D98898 for a proposed usage by SimplifyCFG. I'm simplifying the code by removing the cl::opt. That was added back with the original commit in D19488, but I don't see any evidence in regression tests that it was used. Target-specific overrides can use the usual patterns to adjust as necessary. We could also restore that cl::opt, but it was not clear to me exactly how to do it in the convoluted TTI class structure.	2021-03-22 15:55:34 -04:00
Matt Arsenault	9fdfd8dd52	GlobalISel: Add utility function to constant fold FP ops	2021-03-22 14:38:17 -04:00
Lang Hames	cc4ad2c540	[JITLink][ELF/x86-64] Add support for GOTOFF64 relocation.	2021-03-22 10:40:50 -07:00
Stefan Gränitz	50e499a56d	[Orc] Fix copy elision warning in RPCUtils The `callB()` template function always moved errors on return, because in the majority of cases its return type is an `Expected<T>` and the error must be moved into the implicit ctor. For the special case of a `void` result, however, the `ResultTraits` class is specialized and the return type is a raw `Error`. Some build bots complain, that in favor of NRVO errors should not be moved in this case. ``` llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1513:27: llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1519:27: llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1526:29: warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move] ``` The warning is reasonable from a type-system point of view. For performance it's entirely insignificant. Differential Revision: https://reviews.llvm.org/D98947	2021-03-22 17:47:33 +01:00
Stefan Gränitz	c154cddabd	[Orc] Fix tracking of pending debug objects in DebugObjectManagerPlugin There can be multiple MaterializationResponsibilitys in-flight for a single ResourceKey. Hence, pending debug objects must be tracked by MaterializationResponsibility and not by ResourceKey. Differential Revision: https://reviews.llvm.org/D98785	2021-03-22 17:47:32 +01:00
Philip Reames	d4648eeaa2	[SCEV] Use trip count information to improve shift recurrence ranges This patch exploits the knowledge that we may be running many fewer than bitwidth iterations of the loop, and may be able to disallow the overflow case. This patch specifically implements only the shl case, but this can be generalized to ashr and lshr without difficulty. Differential Revision: https://reviews.llvm.org/D98222	2021-03-22 09:38:43 -07:00
Philip Reames	9c16621c0d	Clarify comments on recurrence matcher [NFC] Triggered by discussion on D98222. The case where we have a loop variant step is suprising, and doesn't match the behavior of SCEV's recurrences. As such, make sure we call that out explicitly.	2021-03-22 09:23:06 -07:00
Wenlei He	ce6bfe9411	[CSSPGO][llvm-profgen] Use profile summary based threshold for context trimming and merging Switch to use cold threshold from profile summary for cold context merging and trimming, instead of relying on hard coded values. Minor refactoring included for switch names, etc. Differential Revision: https://reviews.llvm.org/D98921	2021-03-22 08:56:59 -07:00
Alexey Lapshin	972b6a3a34	[llvm-objcopy][Support] move writeToOutput helper function to Support. writeToOutput function is useful when it is necessary to create different kinds of streams(based on stream name) and when we need to use a temporary file while writing(which would be renamed into the resulting file in a success case). This patch moves the writeToStream helper into the Support library. Differential Revision: https://reviews.llvm.org/D98426	2021-03-22 15:41:10 +03:00
Bradley Smith	48f5a392cb	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Lang Hames	fc36a511c6	[JITLink][ELF/x86-64] Add support for R_X86_64_GOTPC64 and R_X86_64_GOT64. Start adding support for ELF x86-64 large code model, PIC relocations.	2021-03-21 21:52:54 -07:00
Lang Hames	0a74ec3299	[JITLink] Start laying the groundwork for ELF x86-64 large code model support. Introduces DefineExternalSectionStartAndEndSymbols.h, which defines a template for a JITLink pass that transforms external symbols meeting a user-supplied predicate into defined symbols pointing at the start and end of a Section identified by the predicate. JITLink.h is updated with a new makeAbsolute function to support this pass. Also renames BasicGOTAndStubsBuilder to PerGraphGOTAndPLTStubsBuilder -- the new name better describes the intent of this GOT and PLT stubs builder, and will help to distinguish it from future GOT and PLT stub builders that build entries that may be shared between multiple graphs.	2021-03-21 20:56:47 -07:00
Lang Hames	209ceed745	[JITLink][ELF/x86-64] Add Delta32, NegDelta32, NegDelta64 support. These were missing, but are used in eh-frame section support.	2021-03-21 20:15:40 -07:00
Roman Lebedev	e3a4701627	[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights `08196e0b2e` exposed LowerExpectIntrinsic's internal implementation detail in the form of LikelyBranchWeight/UnlikelyBranchWeight options to the outside. While this isn't incorrect from the results viewpoint, this is suboptimal from the layering viewpoint, and causes confusion - should transforms also use those weights, or should they use something else, D98898? So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight internal again, and fixing all the code that used it directly, which currently is only clang codegen, thankfully, to emit proper @llvm.expect intrinsics instead.	2021-03-21 22:50:21 +03:00
Roman Lebedev	37d6be9052	Revert "[BranchProbability] move options for 'likely' and 'unlikely'" Upon reviewing D98898 i've come to realization that these are implementation detail of LowerExpectIntrinsicPass, and they should not be exposed to outside of it. This reverts commit `ee8b53815d`.	2021-03-21 22:50:21 +03:00
Matt Arsenault	20a24af01d	MIR: Fix missing serialization for HasTailCall	2021-03-21 13:14:04 -04:00
Sanjay Patel	ee8b53815d	[BranchProbability] move options for 'likely' and 'unlikely' This makes the settings available for use in other passes by housing them within the Support lib, but NFC otherwise. See D98898 for the proposed usage in SimplifyCFG (where this change was originally included). Differential Revision: https://reviews.llvm.org/D98945	2021-03-20 14:46:46 -04:00
Jeroen Dobbelaere	77080a1eb6	Revert of D49126 [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types. Now that intrinsic name mangling can cope with unnamed types, the custom name mangling in PredicateInfo (introduced by D49126) can be removed. (See D91250, D48541) Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D91661	2021-03-20 11:37:09 +01:00
Shao-Ce Sun	4d11baab25	[NFC][ValueTypes] Align code by column Adjusted some whitespaces. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98975	2021-03-20 13:43:07 +08:00
Lang Hames	f380066461	[JITLink] Remove redundant local variable definitions from a unit test.	2021-03-19 18:29:36 -07:00
Ellis Hoag	d90270e9e8	Port D97640 to llvm/include/llvm/ProfileData/InstrProfData.inc Differential Revision: https://reviews.llvm.org/D98982	2021-03-19 16:24:16 -07:00
Christoffer Lernö	528f6f7d61	Add type attributes to LLVM C API The LLVM C API is missing type attributes as is needed by attributes such as sret and byval. This patch adds three missing wrapper functions. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=48249 https://reviews.llvm.org/D97763	2021-03-19 19:07:04 -04:00
Jessica Paquette	4773dd5ba9	[GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes) There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG. E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain code that matches a bitfield extract from an and + right shift. Rather than duplicating code in the same way, this adds two opcodes: - G_UBFX (unsigned bitfield extract) - G_SBFX (signed bitfield extract) They work like this ``` %x = G_UBFX %y, %lsb, %width ``` Where `lsb` and `width` are - The least-significant bit of the extraction - The width of the extraction This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends the result, while G_SBFX sign-extends the result. This should allow us to use the combiner to match the bitfield extraction patterns rather than duplicating pattern-matching code in each target. Differential Revision: https://reviews.llvm.org/D98464	2021-03-19 14:37:19 -07:00
Fangrui Song	948be862d6	[llvm-readobj] Remove legacy GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} and dump new GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} https://sourceware.org/bugzilla/show_bug.cgi?id=26703 deprecated the previous GNU_PROPERTY_X86_ISA_1_{CMOV,SSE,*} values (renamed to `COMPAT`) and added new values. Since the legacy values are not used by compilers, having dumping support in llvm-readobj is unnecessary. So just drop the legacy feature. The new values are used by GCC 11 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250) `-march=x86-64-v[234]` to indicate the micro-architecture ISA levels. Differential Revision: https://reviews.llvm.org/D98818	2021-03-19 14:35:22 -07:00
Philip Reames	5698537f81	Update basic deref API to account for possiblity of free [NFC] This patch is plumbing to support work towards the goal outlined in the recent llvm-dev post "[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree". The point of this change is purely to simplify iteration on other pieces on way to making the switch. Rebuilding with a change to Value.h is slow and painful, so I want to get the API change landed. Once that's done, I plan to more closely audit each caller, add the inference rules in their own patch, then post a patch with the langref changes and test diffs. The value of the command line flag is that we can exercise the inference logic in standalone patches without needing the whole switch ready to go just yet. Differential Revision: https://reviews.llvm.org/D98908	2021-03-19 11:17:19 -07:00
Alexey Bataev	14ae0cf0f5	[Cost]Canonicalize the cost for logical or/and reductions. The generic cost of logical or/and reductions should be cost of bitcast <ReduxWidth x i1> to iReduxWidth + cmp eq\|ne iReduxWidth. Differential Revision: https://reviews.llvm.org/D97961	2021-03-19 11:01:58 -07:00
Paul C. Anagnostopoulos	a9fc44c557	[TableGen] Improve handling of template arguments This requires changes to TableGen files and some C++ files due to incompatible multiclass template arguments that slipped through before the improved handling.	2021-03-19 09:57:53 -04:00
Jeroen Dobbelaere	04790d9cfb	Support intrinsic overloading on unnamed types This patch adds support for intrinsic overloading on unnamed types. This fixes PR38117 and PR48340 and will also be needed for the Full Restrict Patches (D68484). The main problem is that the intrinsic overloading name mangling is using 's_s' for unnamed types. This can result in identical intrinsic mangled names for different function prototypes. This patch changes this by adding a '.XXXXX' to the intrinsic mangled name when at least one of the types is based on an unnamed type, ensuring that we get a unique name. Implementation details: - The mapping is created on demand and kept in Module. - It also checks for existing clashes and recycles potentially existing prototypes and declarations. - Because of extra data in Module, Intrinsic::getName needs an extra Module* argument and, for speed, an optional FunctionType* argument. - I still kept the original two-argument 'Intrinsic::getName' around which keeps the original behavior (providing the base name). -- Main reason is that I did not want to change the LLVMIntrinsicGetName version, as I don't know how acceptable such a change is -- The current situation already has a limitation. So that should not get worse with this patch. - Intrinsic::getDeclaration and the verifier are now using the new version. Other notes: - As far as I see, this should not suffer from stability issues. The count is only added for prototypes depending on at least one anonymous struct - The initial count starts from 0 for each intrinsic mangled name. - In case of name clashes, existing prototypes are remembered and reused when that makes sense. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D91250	2021-03-19 14:34:25 +01:00
Abhina Sreeskantharajan	4f750f6ebc	[SystemZ][z/OS] Distinguish between text and binary files on z/OS This patch consists of the initial changes to help distinguish between text and binary content correctly on z/OS. I would like to get feedback from Windows users on setting OF_None for all ToolOutputFiles. This seems to have been done as an optimization to prevent CRLF translation on Windows in the past. Reviewed By: zibi Differential Revision: https://reviews.llvm.org/D97785	2021-03-19 08:09:57 -04:00
Simon Pilgrim	a96897219d	[KnownBits] Add knownbits analysis for mulhs/mulu 'multiply high' instructions Split off from D98857 https://reviews.llvm.org/D98866	2021-03-19 08:56:06 +00:00
Wenlei He	1410db70b9	[CSSPGO] Add attribute metadata for context profile This changes adds attribute field for metadata of context profile. Currently we have an inline attribute that indicates whether the leaf frame corresponding to a context profile was inlined in previous build. This will be used to help estimating inlining and be taken into account when trimming context. Changes for that in llvm-profgen will follow. It will also help tuning. Differential Revision: https://reviews.llvm.org/D98823	2021-03-18 22:00:56 -07:00
Philip Reames	fa26da0582	Add a couple of missing attribute query methods [NFC]	2021-03-18 17:33:20 -07:00
Yuanfang Chen	b4a8c0ebb6	[LTO][MC] Discard non-prevailing defined symbols in module-level assembly This is the alternative approach to D96931. In LTO, for each module with inlineasm block, prepend directive ".lto_discard <sym>, <sym>*" to the beginning of the inline asm. ".lto_discard" is both a module inlineasm block marker and (optionally) provides a list of symbols to be discarded. In MC while emitting for inlineasm, discard symbol binding & symbol definitions according to ".lto_disard". Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98762	2021-03-18 15:33:42 -07:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Thomas Lively	2f2ae08da9	[WebAssembly] Remove experimental SIMD instructions Removes the instruction definitions, intrinsics, and builtins for qfma/qfms, signselect, and prefetch instructions, which were not included in the final WebAssembly SIMD spec. Depends on D98457. Differential Revision: https://reviews.llvm.org/D98466	2021-03-18 11:21:24 -07:00
Mike Rice	c2f8e158f5	[OPENMP51]Support for the 'destroy' clause with interop variable. Added basic parsing/sema/serialization support to extend the existing 'destroy' clause for use with the 'interop' directive. Differential Revision: https://reviews.llvm.org/D98834	2021-03-18 09:12:56 -07:00
Andrew Savonichev	e6ce0db378	[MCA] Ensure that writes occur in-order Delay the issue of a new instruction if that leads to out-of-order commits of writes. This patch fixes the problem described in: https://bugs.llvm.org/show_bug.cgi?id=41796#c3 Differential Revision: https://reviews.llvm.org/D98604	2021-03-18 17:10:20 +03:00
Matt Arsenault	b9a0384983	GlobalISel: Preserve source value information for outgoing byval args Pass through the original argument IR value in order to preserve the aliasing information in the memcpy memory operands.	2021-03-18 09:16:54 -04:00
Matt Arsenault	61f834cc09	GlobalISel: Insert memcpy for outgoing byval arguments byval requires an implicit copy between the caller and callee such that the callee may write into the stack area without it modifying the value in the parent. Previously, this was passing through the raw pointer value which would break if the callee wrote into it. Most of the time, this copy can be optimized out (however we don't have the optimization SelectionDAG does yet). This will trigger more fallbacks for AMDGPU now, since we don't have legalization for memcpy yet (although we should stop using byval anyway).	2021-03-18 09:16:54 -04:00
Max Kazantsev	b3a1500ea8	[SCEV][NFC] API for predicate evaluation Provides API that allows to check predicate for being true or false with one call. Current implementation is naive and just calls isKnownPredicate twice, but further we can rework this logic trying to use one check to prove both facts.	2021-03-18 19:21:29 +07:00
Sanjay Patel	c8893f3b78	[LoopVectorize] relax FMF constraint for FP induction This makes the induction part of the loop vectorizer match the reduction part. We do not need all of the fast-math-flags. For example, there are some that clearly are not in play like arcp or afn. If we want to make FMF constraints consistent across the IR optimizer, we might want to add nsz too, but that's up for debate (users can't expect associative FP math and preservation of sign-of-zero at the same time?). The calling code was fixed to avoid miscompiles with: `1bee549737` Differential Revision: https://reviews.llvm.org/D98708	2021-03-18 08:11:22 -04:00
Lang Hames	0604e0bc68	[JITLink] Reformat an enum.	2021-03-17 21:43:53 -07:00
Lang Hames	86ec3fd9d9	[JITLink] Improve out-of-range error messages. Switches all backends to use the makeTargetOutOfRangeError function from JITLink.h.	2021-03-17 21:35:24 -07:00
Chen Zheng	d33b016ada	[XCOFF][llvm-dwarfdump] llvm-dwarfdump support for XCOFF Author: hubert.reinterpretcast, shchenz Reviewed By: jasonliu, echristo Differential Revision: https://reviews.llvm.org/D97186	2021-03-17 21:21:51 -04:00
Joel E. Denny	dd59c1324d	[FileCheck] Fix numeric error propagation A more general name might be match-time error propagation. That is, it's conceivable we'll one day have non-numeric errors that require the handling fixed by this patch. Without this patch, FileCheck behaves as follows: ``` $ cat check CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]] $ FileCheck -vv -dump-input=never check < input check:1:54: remark: implicit EOF: expected string found in input CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]] ^ <stdin>:2:1: note: found here ^ check:1:15: error: unable to substitute variable or numeric expression: overflow error CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]] ^ $ echo $? 0 ``` Notice that the exit status is 0 even though there's an error. Moreover, FileCheck doesn't print the error diagnostic unless both `-dump-input=never` and `-vv` are specified. The same problem occurs when `CHECK-NOT` does have a match but a capture fails due to overflow: exit status is 0, and no diagnostic is printed unless both `-dump-input=never` and `-vv` are specified. The usefulness of capturing from `CHECK-NOT` is questionable, but this case should certainly produce an error. With this patch, FileCheck always includes the error diagnostic and has non-zero exit status for the above examples. It's conceivable that this change will cause some existing tests to fail, but my assumption is that they should fail. Moreover, with nearly every project enabled, this patch didn't produce additional `check-all` failures for me. This patch also extends input dumps to include such numeric error diagnostics for both expected and excluded patterns. As noted in fixmes in some of the tests added by this patch, this patch worsens an existing issue with redundant diagnostics. I'll fix that bug in a subsequent patch. Reviewed By: thopre, jhenderson Differential Revision: https://reviews.llvm.org/D98086	2021-03-17 19:25:41 -04:00
Mike Rice	c615927c8e	[OPENMP51]Initial support for the use clause. Added basic parsing/sema/serialization support for the 'use' clause. Differential Revision: https://reviews.llvm.org/D98815	2021-03-17 15:46:14 -07:00
Amara Emerson	d7fed7b899	[AArch64][GlobalISel] Fall back if disabling neon/fp in the translator. The previous technique relied on early-exiting the legalizer predicate initialization, leaving an empty rule table. That causes a fallback for most instructions, but some have legacy rules defined like G_ZEXT which can try continue, but then crash. We should fall back earlier, in the translator, to avoid this issue. Differential Revision: https://reviews.llvm.org/D98730	2021-03-17 15:08:08 -07:00
Philip Reames	31764ea295	[LCSSA] Extract a utility for deciding if a new use requires a new lcssa phi [NFC] (Triggered by a review comment on D98728, but otherwise unrelated.)	2021-03-17 12:14:01 -07:00
David Green	e2935dcfc4	[TTI] Add a Mask to getShuffleCost This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask can be provided a more accurate cost can be provided by the backend. For example VREV costs could be returned by the ARM backend. This should be an NFC until then, laying the groundwork for that to be added. Differential Revision: https://reviews.llvm.org/D98206	2021-03-17 17:46:26 +00:00
Stephen Tozer	3bfddc2593	Reapply "[DebugInfo] Handle multiple variable location operands in IR" Fixed section of code that iterated through a SmallDenseMap and added instructions in each iteration, causing non-deterministic code; replaced SmallDenseMap with MapVector to prevent non-determinism. This reverts commit `01ac6d1587`.	2021-03-17 16:45:25 +00:00
Mike Rice	410f09af09	[OPENMP51]Initial support for the interop directive. Added basic parsing/sema/serialization support for interop directive. Support for the 'init' clause. Differential Revision: https://reviews.llvm.org/D98558	2021-03-17 09:42:07 -07:00
Simon Pilgrim	cfc256ba9f	[DAG] TargetLowering::isBinOp() - add ISD::SSUBSAT/USUBSAT Add to the generic non-commutative binop list.	2021-03-17 14:51:00 +00:00
Alexey Lapshin	021de7cf80	[llvm-objcopy][NFC] Move ownership keeping code into restoreStatOnFile(). The D93881 added functionality which preserve ownership for output file if llvm-objcopy is called under root. That code was added into the place where output file is created. The llvm-objcopy already has a function which sets/restores rights/permissions for the output file. That is the restoreStatOnFile() function. This patch moves code (preserving ownershipping) into the restoreStatOnFile() function. Differential Revision: https://reviews.llvm.org/D98511	2021-03-17 17:27:00 +03:00
Hans Wennborg	01ac6d1587	Revert "[DebugInfo] Handle multiple variable location operands in IR" This caused non-deterministic compiler output; see comment on the code review. > This patch updates the various IR passes to correctly handle dbg.values with a > DIArgList location. This patch does not actually allow DIArgLists to be produced > by salvageDebugInfo, and it does not affect any pass after codegen-prepare. > Other than that, it should cover every IR pass. > > Most of the changes simply extend code that operated on a single debug value to > operate on the list of debug values in the style of any_of, all_of, for_each, > etc. Instances of setOperand(0, ...) have been replaced with with > replaceVariableLocationOp, which takes the value that is being replaced as an > additional argument. In places where this value isn't readily available, we have > to track the old value through to the point where it gets replaced. > > Differential Revision: https://reviews.llvm.org/D88232 This reverts commit `df69c69427`.	2021-03-17 13:36:48 +01:00
Bradley Smith	cf0da91ba5	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Fangrui Song	5bd6b0a62b	[MC] Delete unused MCOperand::{create,is,get}FPImm	2021-03-17 00:30:38 -07:00
Max Kazantsev	a6074b092c	[BasicAA] Drop dependency on Loop Info. PR43276 BasicAA stores a reference to LoopInfo inside. This imposes an implicit requirement of keeping it up to date whenever we modify the IR (in particular, whenever we modify terminators of blocks that belong to loops). Failing to do so leads to incorrect state of the LoopInfo. Because general AA does not require loop info updates and provides to API to update it properly, the users of AA reasonably assume that there is no need to update the loop info. It may be a reason of bugs, as example in PR43276 shows. This patch drops dependence of BasicAA on LoopInfo to avoid this problem. This may potentially pessimize the result of queries to BasicAA. Differential Revision: https://reviews.llvm.org/D98627 Reviewed By: nikic	2021-03-17 11:43:44 +07:00
Anirudh Prasad	9f5da80013	Revert "[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax"" This reverts commit `b605cfb336`. Differential Revision: https://reviews.llvm.org/D98744	2021-03-16 18:39:04 -04:00
Fangrui Song	5d037458a3	[RISCV] Make empty name symbols SF_FormatSpecific so that llvm-symbolizer ignores them for symbolization On RISC-V, clang emits empty name symbols used for label differences. (In GCC the symbols are typically `.L0`) After D95916, the empty name symbols can show up in llvm-symbolizer's symbolization output. They have no names and thus not useful. Set `SF_FormatSpecific` so that llvm-symbolizer will ignore them. `SF_FormatSpecific` is also used in LTO but that case should not matter. Corresponding addr2line problem: https://sourceware.org/bugzilla/show_bug.cgi?id=27585 Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98669	2021-03-16 14:12:18 -07:00
Anirudh Prasad	b605cfb336	[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax" - Previously, https://reviews.llvm.org/D97703 was [[ https://reviews.llvm.org/D98543 \| reverted ]] as it broke when building the unit tests when shared libs on. - This patch reverts the "revert" and makes two minor changes - The first is it also links in the MCParser lib when building the unittest. This should resolve the issue when building with with shared libs on and off - The second renames the name of the unit test from `SystemZAsmLexer` to `SystemZAsmLexerTests` since the convention for unittest binaries is to suffix the name of the unit test with "Tests" Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D98666	2021-03-16 17:11:46 -04:00
Nikita Popov	40bc309911	Revert "[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration" This reverts commit `d40b4911bd`. This causes a large compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=0aa637b2037d882ddf7861284169abf63f524677&to=d40b4911bd9aca0573752e065f29ddd9aff280e1&stat=instructions	2021-03-16 20:41:26 +01:00
Mircea Trofin	d40b4911bd	[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs() needs to be called before iterating the interfering live ranges. The rest of the patch offers support that is the case: instead of clearing the query's InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix is invalidated (existing triggering mechanism). Without the change in RegAllocGreedy.cpp, the compiler ices. This patch should make it more easily discoverable by developers that collectInterferringVregs needs to be called before iterating. I will follow up with a subsequent patch to improve the usability and maintainability of Query. Differential Revision: https://reviews.llvm.org/D98232	2021-03-16 12:10:10 -07:00
Nick Lewycky	b743bbc505	Add ConstantDataVector::getRaw() to create a constant data vector from raw data. This parallels ConstantDataArray::getRaw() and can be used with ConstantDataSequential::getRawDataValues() in the base class for both types. Update BuildConstantData{Array,Vector} tests to test the getRaw API. Also removes its unused Module. In passing, update some comments to include the support for half and bfloat. Update tests to include testing for bfloat. Differential Revision: https://reviews.llvm.org/D98302	2021-03-16 11:57:53 -07:00
Aaron Puchert	1cb15b10ea	Correct Doxygen syntax for inline code There is no syntax like {@code ...} in Doxygen, @code is a block command that ends with @endcode, and generally these are not enclosed in braces. The correct syntax for inline code snippets is @c <code>. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D98665	2021-03-16 15:17:45 +01:00
serge-sans-paille	b2e78a061c	[NFC] Use SmallString instead of std::string for the AttrBuilder This avoids a few unnecessary conversion from StringRef to std::string, and a bunch of extra allocation thanks to the SmallString. Differential Revision: https://reviews.llvm.org/D98190	2021-03-16 13:34:14 +01:00
Caroline Concatto	3c03635d53	[SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse This patch adds support for reverse loop vectorization. It is possible to vectorize the following loop: ``` for (int i = n-1; i >= 0; --i) a[i] = b[i] + 1.0; ``` with fixed or scalable vector. The loop-vectorizer will use 'reverse' on the loads/stores to make sure the lanes themselves are also handled in the right order. This patch adds support for scalable vector on IRBuilder interface to create a reverse vector. The IR function CreateVectorReverse lowers to experimental.vector.reverse for scalable vector and keedp the original behavior for fixed vector using shuffle reverse. Differential Revision: https://reviews.llvm.org/D95363	2021-03-16 07:51:59 +00:00
Bing1 Yu	4f198b0c27	[X86] Pass to transform amx intrinsics to scalar operation. This pass runs in any situations but we skip it when it is not O0 and the function doesn't have optnone attribute. With -O0, the def of shape to amx intrinsics is near the amx intrinsics code. We are not able to find a point which post-dominate all the shape and dominate all amx intrinsics. To decouple the dependency of the shape, we transform amx intrinsics to scalar operation, so that compiling doesn't fail. In long term, we should improve fast register allocation to allocate amx register. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D93594	2021-03-16 10:40:22 +08:00
Lang Hames	ecf6466f01	[JITLink][MachO][x86-64] Introduce generic x86-64 support. This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64 backend to use these edge kinds. This simplifies the implementation of the MachO/x86-64 backend and makes it possible to write generic x86-64 passes and utilities. The new edge kinds are different from the original set used in the MachO/x86-64 backend. Several edge kinds that were not meaningfully distinguished in that backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in the new scheme (these edge kinds can be reintroduced later if we find a use for them). At the same time, new edge kinds have been introduced to convey extra information about the state of the graph. E.g. The RequestAndTransformTo* edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP entries, and the 'Relaxable' suffix distinguishes edges that are candidates for optimization from edges which should be left as-is (e.g. to enable runtime redirection). ELF/x86-64 will be refactored to use these generic edges at some point in the future, and I anticipate a similar refactor to create a generic arm64 support header too. Differential Revision: https://reviews.llvm.org/D98305	2021-03-15 15:43:07 -07:00
Nick Lewycky	483a253ae9	NFC: Formatting changes. Run clang-format over these files. Capitalize some variable names per clang-tidy's request. Pulled out to simplify review of D98302.	2021-03-15 14:26:39 -07:00
Florian Hahn	bb244ea2a8	[AnnotationRemarks] Remove unneeded Function.h include (NFC).	2021-03-15 21:09:35 +00:00
Wenlei He	a5d30421a6	[CSSPGO] Load context profile for external functions in PreLink and populate ThinLTO import list For ThinLTO's prelink compilation, we need to put external inline candidates into an import list attached to function's entry count metadata. This enables ThinLink to treat such cross module callee as hot in summary index, and later helps postlink to import them for profile guided cross module inlining. For AutoFDO, the import list is retrieved by traversing the nested inlinee functions. For CSSPGO, since profile is flatterned, a few things need to happen for it to work: - When loading input profile in extended binary format, we need to load all child context profile whose parent is in current module, so context trie for current module includes potential cross module inlinee. - In order to make the above happen, we need to know whether input profile is CSSPGO profile before start reading function profile, hence a flag for profile summary section is added. - When searching for cross module inline candidate, we need to walk through the context trie instead of nested inlinee profile (callsite sample of AutoFDO profile). - Now that we have more accurate counts with CSSPGO, we swtiched to use entry count instead of total count to decided if an external callee is potentially beneficial to inline. This make it consistent with how we determine whether call tagert is potential inline candidate. Differential Revision: https://reviews.llvm.org/D98590	2021-03-15 12:22:15 -07:00
Fangrui Song	5d44c92bf8	Change void getNoop(MCInst &NopInst) to MCInst getNop() Prefer (self-documenting) return values to output parameters (which are liable to be used). While here, rename Noop to Nop which is more widely used and improves consistency with hasEmitNops/setEmitNops/emitNop/etc.	2021-03-15 12:05:34 -07:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Zahira Ammarguellat	80ca4fd154	[NFC] Fix "unused parameter" error revealed in the Linux self-build.	2021-03-15 12:17:11 -04:00
Jan Kratochvil	a28facba1c	[llvm] [dwarf] Fix DWARFListTableHeader::getOffsetEntry off-by-one D98289 was erroneously reporting `invalid range list offset 0x20110` instead of `invalid range list table index 0`. Differential Revision: https://reviews.llvm.org/D98589	2021-03-14 21:42:44 +01:00
Matt Arsenault	d57d8f364f	CodeGen: Reorder MachinePointerInfo fields This saves a little bit of padding.	2021-03-14 10:06:39 -04:00
kuterd	44c1425c17	[Attributor][fix] Remove problematic EXPENSIVE_CHECK Remove the check that is causing compilation issues in some build configurations.	2021-03-13 18:03:24 +03:00
Lang Hames	4e30b20bdb	[JITLink][ORC] Make the LinkGraph available to modifyPassConfig. This makes the target triple, graph name, and full graph content available when making decisions about how to populate the linker pass pipeline. Also updates the LLJITWithObjectLinkingLayerPlugin example to show more API use, including use of the API changes in this patch.	2021-03-12 18:42:51 -08:00
Hubert Tong	4f9cc1512d	Revert "[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax" This reverts commit `bcdd40f802`. See https://reviews.llvm.org/D98543.	2021-03-12 14:48:00 -05:00
Zahira Ammarguellat	c2006f857d	[NFC] Fix "unused parameter" error revealed in the Linux self-build.	2021-03-12 10:26:40 -08:00
Anirudh Prasad	bcdd40f802	[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax - This patch adds in support for the ordinary HLASM comment syntax asm statements (Reference - Chapter 7, Comment Statements, Ordinary Comment Statements) - In brief, the ordinary comment syntax if used, must begin with the "" character - To achieve this, this patch makes use of the CommentString attribute provided in the base MCAsmInfo class - In the SystemZMCAsmInfo class, the CommentString attribute was set to "" based on the assembler dialect - Furthermore, a new attribute RestrictCommentString, is provided to only treat a string as a comment if it appears at the start of the asm statement. Example: "jo -4" is valid in HLASM (jump back 4 bytes from current point - similar to jo -4 in gnu asm) and we don't want "-4" to be treated as a comment. - RFC for HLASM Parser support implementation: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html Reviewed By: scott.linder, Kai Differential Revision: https://reviews.llvm.org/D97703	2021-03-12 11:56:11 -05:00
Matt Arsenault	6b76d82853	GlobalISel: Fix marking byval arguments as immutable byval arguments need to be assumed writable. Only implicitly stack passed arguments which aren't addressable in the IR can be assumed immutable. Mips is still broken since for some reason its doing its own thing with the ValueHandlers (and x86 doesn't actually handle byval arguments now, although some of the code is there).	2021-03-12 09:01:53 -05:00
serge-sans-paille	bc4a5bdce4	[NFC] Use StringRef instead of const char* for AsmPrinter This avoids calling strlen to repeatedly compute some string size.	2021-03-12 14:39:25 +01:00
Hans Wennborg	f50aef745c	Revert "[InstrProfiling] Don't generate __llvm_profile_runtime_user" This broke the check-profile tests on Mac, see comment on the code review. > This is no longer needed, we can add __llvm_profile_runtime directly > to llvm.compiler.used or llvm.used to achieve the same effect. > > Differential Revision: https://reviews.llvm.org/D98325 This reverts commit `c7712087cb`. Also reverting the dependent follow-up commit: Revert "[InstrProfiling] Generate runtime hook for ELF platforms" > When using -fprofile-list to selectively apply instrumentation only > to certain files or functions, we may end up with a binary that doesn't > have any counters in the case where no files were selected. However, > because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the > runtime would still be pulled in and incur some non-trivial overhead, > especially in the case when the continuous or runtime counter relocation > mode is being used. A better way would be to pull in the profile runtime > only when needed by declaring the __llvm_profile_runtime symbol in the > translation unit only when needed. > > This approach was already used prior to `9a041a7522`, but we changed it > to always generate the __llvm_profile_runtime due to a TAPI limitation. > Since TAPI is only used on Mach-O platforms, we could use the early > emission of __llvm_profile_runtime there, and on other platforms we > could change back to the earlier approach where the symbol is generated > later only when needed. We can stop passing -u__llvm_profile_runtime to > the linker on Linux and Fuchsia since the generated undefined symbol in > each translation unit that needed it serves the same purpose. > > Differential Revision: https://reviews.llvm.org/D98061 This reverts commit `87fd09b25f`.	2021-03-12 13:53:46 +01:00
Serguei Katkov	cfe8f8e0f0	Revert "Mark gc.relocate and gc.result as readnone" As readnone function they become movable and LICM can hoist them out of a loop. As a result in LCSSA form phi node of type token is created. No one is ready that GCRelocate first operand is phi node but expects to be token. GVN test were also updated, it seems it does not do what is expected. Test for LICM is also added. This reverts commit `f352463ade`.	2021-03-12 16:59:17 +07:00
Johannes Doerfert	0fe0d114e4	Revert "[OpenMP] Introduce the `disable_selector_propagation` variant selector trait" Need to revert `ad9e98b8ef` which this commit depends on. This reverts commit f771ef7b5f0ed260d00931cd50e6fe462edbacaf.	2021-03-11 23:48:35 -06:00
Johannes Doerfert	b2642456ab	[OpenMP] Introduce the `disable_selector_propagation` variant selector trait Nested `omp [begin\|end] declare variant` inherit the selectors from surrounding `omp (begin\|end) declare variant` constructs. To stop such propagation the user can add the `disable_selector_propagation` to the `extension` set in the `implementation` selector. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D95765	2021-03-11 23:31:25 -06:00
Carl Ritson	c07f2025e4	[AMDGPU] Restrict image_msaa_load to MSAA dimension types This instruction is only valid on 2D MSAA and 2D MSAA Array surfaces. Remove intrinsic support for other dimension types, and block assembly for unsupported dimensions. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D98397	2021-03-12 09:47:24 +09:00
Florian Hahn	c92ec0dd92	[Matrix] Add support for matrix-by-scalar division. This patch extends the matrix spec to allow matrix-by-scalar division. Originally support for `/` was left out to avoid ambiguity for the matrix-matrix version of `/`, which could either be elementwise or specified as matrix multiplication M1 * (1/M2). For the matrix-scalar version, no ambiguity exists; `*` is also an elementwise operation in that case. Matrix-by-scalar division is commonly supported by systems including Matlab, Mathematica or NumPy. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97857	2021-03-11 22:21:23 +00:00
Wenlei He	051f2c144e	[SamplePGO] Skip inlinee profile scaling for sample loader inlining For CGSCC inline, we need to scale down a function's branch weights and entry counts when thee it's inlined at a callsite. This is done through updateCallProfile. Additionally, we also scale the weigths for the inlined clone based on call site count in updateCallerBFI. Neither is needed for inlining during sample profile loader as it's using context profile that is separated from inlinee's own profile. This change skip the inlinee profile scaling for sample loader inlining. Differential Revision: https://reviews.llvm.org/D98187	2021-03-11 10:18:26 -08:00
David Green	fad70c3068	[ARM] Improve WLS lowering Recently we improved the lowering of low overhead loops and tail predicated loops, but concentrated first on the DLS do style loops. This extends those improvements over to the WLS while loops, improving the chance of lowering them successfully. To do this the lowering has to change a little as the instructions are terminators that produce a value - something that needs to be treated carefully. Lowering starts at the Hardware Loop pass, inserting a new llvm.test.start.loop.iterations that produces both an i1 to control the loop entry and an i32 similar to the llvm.start.loop.iterations intrinsic added for do loops. This feeds into the loop phi, properly gluing the values together: %wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div) %wls0 = extractvalue { i32, i1 } %wls, 0 %wls1 = extractvalue { i32, i1 } %wls, 1 br i1 %wls1, label %loop.ph, label %loop.exit ... loop: %lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ] .. %iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1) %cmp = icmp ne i32 %iv.next, 0 br i1 %cmp, label %loop, label %loop.exit The llvm.test.start.loop.iterations need to be lowered through ISel lowering as a pair of WLS and WLSSETUP nodes, which each get converted to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent t2WhileLoopStart from being a terminator that produces a value, something difficult to control at that stage in the pipeline. Instead the t2WhileLoopSetup produces the value of LR (essentially acting as a lr = subs rn, 0), t2WhileLoopStart consumes that lr value (the Bcc). These are then converted into a single t2WhileLoopStartLR at the same point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop to prevent them from progressing further in the pipeline. The t2WhileLoopStartLR is a single instruction that takes a GPR and produces LR, similar to the WLS instruction. %1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3 t2B %bb.1 ... bb.2.loop: %2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2 ... %3:gprlr = t2LoopEndDec %2:gprlr, %bb.2 t2B %bb.3 The t2WhileLoopStartLR can then be treated similar to the other low overhead loop pseudos, eventually being lowered to a WLS providing the branches are within range. Differential Revision: https://reviews.llvm.org/D97729	2021-03-11 17:56:19 +00:00
Hiroshi Yamauchi	365b225d46	[PGO] Fix two issues in PGOMemOPSizeOpt. 1. PGOMemOPSizeOpt grabs only the first, up to five (by default) entries from the value profile metadata and preserves the remaining entries for the fallback memop call site. If there are more than five entries, the rest of the entries would get dropped. This is fine for PGOMemOPSizeOpt itself as it only promotes up to 3 (by default) values, but potentially not for other downstream passes that may use the value profile metadata. 2. PGOMemOPSizeOpt originally assumed that only values 0 through 8 are kept track of. When the range buckets were introduced, it was changed to skip the range buckets, but since it does not grab all entries (only five), if some range buckets exist in the first five entries, it could potentially cause fewer promotion opportunities (eg. if 4 out of 5 were range buckets, it may be able to promote up to one non-range bucket, as opposed to 3.) Also, combined with 1, it means that wrong entries may be preserved, as it didn't correctly keep track of which were entries were skipped. To fix this, PGOMemOPSizeOpt now grabs all the entries (up to the maximum number of value profile buckets), keeps track of which entries were skipped, and preserves all the remaining entries. Differential Revision: https://reviews.llvm.org/D97592	2021-03-11 09:53:05 -08:00
Nikita Popov	6312c53870	[IRBuilder] Deprecate CreateLoad APIs with implicit type These APIs are not compatible with opaque pointers. Deprecate them to avoid the introduction of further uses.	2021-03-11 18:46:45 +01:00
Craig Topper	e9426dfbae	[ValueTypes][RISCV] Add MVT for v1f16. RISCV makes all fixed vector MVTs with size less than or equal to a command line option legal. This didn't include v1f16 because it was missing but did include v1f32 and v1f64. One test is affected where we did test this type, but it is a horizontal reduction so it is non-sensical. Perhaps we should canonicalize that away somewhere. I'm not sure if we should be making v1 types legal, but this will at least make RISCV consistent across all types. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98365	2021-03-11 09:23:18 -08:00
Joseph Huber	807466ef28	[OpenMP] Restore backwards compatibility for libomptarget Summary: The changes introduced in D87946 changed the API for libomptarget functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x but was not given a backward-compatible interface. This change will require people using Clang 13.x or 12.x to recompile their offloading programs. Reviewed By: jdoerfert cchen Differential Revision: https://reviews.llvm.org/D98358	2021-03-11 09:52:11 -05:00
Stephen Tozer	f40976bd01	Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reverts commit `c0f3dfb9f1`. Reverted due to an error on the clang-x64-windows-msvc buildbot.	2021-03-11 14:48:01 +00:00
Simon Pilgrim	cc48b45d24	[llvm-mca] Fix uninitialized variable in InOrderIssueStage constructor warning. NFCI.	2021-03-11 14:41:20 +00:00
Simon Pilgrim	9a259f4386	[Transforms] SampleProfileLoaderBaseImpl<BT>::getFunctionLoc - fix Wdocumentation warnings. NFCI.	2021-03-11 14:04:08 +00:00
gbtozers	c0f3dfb9f1	[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic operations with two or more non-const operands; this includes the GetElementPtr instruction, and most Binary Operator instructions. These salvages produce DIArgList locations and are only valid for dbg.values, as currently variadic DIExpressions must use DW_OP_stack_value. This functionality is also only added for salvageDebugInfoForDbgValues; other functions that directly call salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be updated in a later patch. Differential Revision: https://reviews.llvm.org/D91722	2021-03-11 13:33:49 +00:00
Serguei Katkov	0480927712	[Statepoint Lowering] Handle the case with several gc.result Recently gc.result has been marked with readnone instead of readonly and this opens a door for different optimization to duplicate gc.result. Statepoint lowering is not ready to see several gc.results. The problem appears when there are gc.results with one located in the same basic block and another located in other basic block. In this case we need both export VR and fill local setValue. Note that this case is not sufficient optimization done before CodeGen. It is evident that local gc.result dominates all other gc.results and it is handled by GVN and EarlyCSE. But anyway, even if IR is not optimal Backend should not crash on a valid IR. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98393	2021-03-11 18:44:44 +07:00
Simon Pilgrim	e74d626925	[IPO] Fix EXPENSIVE_CHECKS assert added at D83744. NFCI. It wasn't taking into account that QueryingAA was a pointer.	2021-03-11 10:29:15 +00:00
Nikita Popov	403da6a69a	Reapply [LICM] Make promotion faster Relative to the previous implementation, this always uses aliasesUnknownInst() instead of aliasesPointer() to correctly handle atomics. The added test case was previously miscompiled. ----- Even when MemorySSA-based LICM is used, an AST is still populated for scalar promotion. As the AST has quadratic complexity, a lot of time is spent in this step despite the existing access count limit. This patch optimizes the identification of promotable stores. The idea here is pretty simple: We're only interested in must-alias mod sets of loop invariant pointers. As such, only populate the AST with loop-invariant loads and stores (anything else is definitely not promotable) and then discard any sets which alias with any of the remaining, definitely non-promotable accesses. If we promoted something, check whether this has made some other accesses loop invariant and thus possible promotion candidates. This is much faster in practice, because we need to perform AA queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable) instead of O(NumTotal^2), and NumPromotable tends to be small. Additionally, promotable accesses have loop invariant pointers, for which AA is cheaper. This has a signicant positive compile-time impact. We save ~1.8% geomean on CTMark at O3, with 6% on lencod in particular and 25% on individual files. Conceptually, this change is NFC, but may not be so in practice, because the AST is only an approximation, and can produce different results depending on the order in which accesses are added. However, there is at least no impact on the number of promotions (licm.NumPromoted) in test-suite O3 configuration with this change. Differential Revision: https://reviews.llvm.org/D89264	2021-03-11 10:50:28 +01:00
Djordje Todorovic	9f41c03f82	[Debugify][OriginalDIMode] Export the report into JSON file By using the original-di check with debugify in the combination with the llvm/utils/llvm-original-di-preservation.py it becomes very user friendly tool. An example of the HTML page with the issues related to debug info can be found at [0]. [0] https://djolertrk.github.io/di-checker-html-report-example/ Differential Revision: https://reviews.llvm.org/D82546	2021-03-11 01:11:13 -08:00
Petr Hosek	c7712087cb	[InstrProfiling] Don't generate __llvm_profile_runtime_user This is no longer needed, we can add __llvm_profile_runtime directly to llvm.compiler.used or llvm.used to achieve the same effect. Differential Revision: https://reviews.llvm.org/D98325	2021-03-10 22:33:51 -08:00
Reid Kleckner	b69db4a7ab	Re-land "[PDB] Defer relocating .debug$S until commit time and parallelize it" This reverts commit `bacf9cf2c5` and reinstates commit `1a9bd5b813`. Reverting this commit did not appear to make the problem go away, so we can go ahead and reland it.	2021-03-10 15:14:09 -08:00
kuterd	d75c9e61a5	[Attributor] Attributor call site specific AAValueConstantRange This patch makes uses of the context bridges introduced in D83299 to make AAValueConstantRange call site specific. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D83744	2021-03-11 01:19:44 +03:00
Quentin Colombet	49942c6d4a	[NFC] Fix a compiler warning Fix a warning caused by -Wrange-loop-analysis Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98297	2021-03-10 13:28:53 -08:00
Alexey Lapshin	4f16e177e1	[llvm-objcopy][NFC] replace class Buffer/MemBuffer/FileBuffer with streams. During D88827 it was requested to remove the local implementation of Memory/File Buffers: // TODO: refactor the buffer classes in LLVM to enable us to use them here // directly. This patch uses raw_ostream instead of Buffers. Generally, using streams could allow us to reduce memory usages. No need to load all data into the memory - the data could be streamed through a smaller buffer. Thus, this patch uses raw_ostream as an interface for output data: Error executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out); Note 1. This patch does not change the implementation of Writers so that data would be directly stored into raw_ostream. This is assumed to be done later. Note 2. It would be better if Writers would be implemented in a such way that data could be streamed without seeking/updating. If that would be inconvenient then raw_ostream could be replaced with raw_pwrite_stream to have a possibility to seek back and update file headers. This is assumed to be done later if necessary. Note 3. Current FileOutputBuffer allows using a memory-mapped file. The raw_fd_ostream (which could be used if data should be stored in the file) does not allow us to use a memory-mapped file. Memory map functionality could be implemented for raw_fd_ostream: It is possible to add resize() method into raw_ostream. class raw_ostream { void resize(uint64_t size); } That method, implemented for raw_fd_ostream, could create a memory-mapped file. The streamed data would be written into that memory file then. Thus we would be able to use memory-mapped files with raw_fd_ostream. This is assumed to be done later if necessary. Differential Revision: https://reviews.llvm.org/D91028	2021-03-10 23:50:04 +03:00
Sriraman Tallam	0ba1ebcbb7	Remove original implementation of UniqueInternalLinkageNames pass. D96109 was recently submitted which contains the refactored implementation of -funique-internal-linakge-names by adding the unique suffixes in clang rather than as an LLVM pass. Deleting the former implementation in this change. Differential Revision: https://reviews.llvm.org/D98234	2021-03-10 11:57:40 -08:00
Quentin Colombet	66dab2fa84	[NFC] Fix compiler warnings Fix warnings caused by -Wrange-loop-analysis. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98298	2021-03-10 11:03:50 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've remove the special case I previously put in for ABS for D97991 as the default expansion is now able to succesfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Craig Topper	0c73a506e8	[RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32. Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen just that it is functional. This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x intrinsic and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a really unoptimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895	2021-03-10 09:45:38 -08:00
Ta-Wei Tu	7ff2768be1	Revert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`" This reverts commit `df9158c9a4`.	2021-03-11 01:24:43 +08:00
Stephen Tozer	1db137b185	[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578	2021-03-10 17:15:24 +00:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
Christudasan Devadasan	4c6ab48fb1	GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM It is good to have a combined `divrem` instruction when the `div` and `rem` are computed from identical input operands. Some targets can lower them through a single expansion that computes both division and remainder. It effectively reduces the number of instructions than individually expanding them. Reviewed By: arsenm, paquette Differential Revision: https://reviews.llvm.org/D96013	2021-03-10 18:46:07 +05:30
Alex Richardson	35bf23e965	Avoid shuffle self-assignment in EXPENSIVE_CHECKS builds Some versions of libstdc++ perform self-assignment in std::shuffle. This breaks the EXPENSIVE_CHECKS builds of TableGen due to an incorrect assertion in libstdc++. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85828. Fixes https://llvm.org/PR37652 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D98167	2021-03-10 11:17:34 +00:00
Vladislav Vinogradov	dc8446c2a0	[ADT][NFC] Use `size_t` type for index in `indexed_accessor_range` It makes it consistent with `size()` method return type and with STL-like containers API. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D97921	2021-03-10 11:40:59 +03:00
Wei Mi	ee35784a90	[SampleFDO] Support enabling -funique-internal-linkage-name. now -funique-internal-linkage-name flag is available, and we want to flip it on by default since it is beneficial to have separate sample profiles for different internal symbols with the same name. As a preparation, we want to avoid regression caused by the flip. When we flip -funique-internal-linkage-name on, the profile is collected from binary built without -funique-internal-linkage-name so it has no uniq suffix, but the IR in the optimized build contains the suffix. This kind of mismatch may introduce transient regression. To avoid such mismatch, we introduce a NameTable section flag indicating whether there is any name in the profile containing uniq suffix. Compiler will decide whether to keep uniq suffix during name canonicalization depending on the NameTable section flag. The flag is only available for extbinary format. For other formats, by default compiler will keep uniq suffix so they will only experience transient regression when -funique-internal-linkage-name is just flipped. Another type of regression is caused by places where we miss to call getCanonicalFnName. Those places are fixed. Differential Revision: https://reviews.llvm.org/D96932	2021-03-09 21:41:40 -08:00
Amara Emerson	55e760769b	[GlobalISel] Fold away G_BUILD_VECTOR with all elements extracted. If every element is extracted from a G_BUILD_VECTOR, pass through the source registers. This is different to the extract(build_vector) combine because this one tolerates multiple users as long as they're exhaustive. Differential Revision: https://reviews.llvm.org/D97890	2021-03-09 11:34:26 -08:00
Amara Emerson	e60ab72137	[AArch64][GlobalISel] Add combine for extract_vector_elt(build_vector, cst) Differential Revision: https://reviews.llvm.org/D97835	2021-03-09 11:08:02 -08:00
Fangrui Song	42e3f97a9d	[MC] Change ELFOSABI_NONE to ELFOSABI_GNU for SHF_GNU_RETAIN GNU ld does not give SHF_GNU_RETAIN GC root semantics for ELFOSABI_NONE. (https://sourceware.org/pipermail/binutils/2021-March/115581.html) This allows GNU ld to interpret SHF_GNU_RETAIN and avoids a gold quirk https://sourceware.org/bugzilla/show_bug.cgi?id=27490 Because ELFObjectWriter is in an anonymous namespace, I have to place `markGnuAbi` in the parent MCObjectWriter. Differential Revision: https://reviews.llvm.org/D97976	2021-03-09 09:59:47 -08:00
Nikita Popov	55ae279ba7	[FastISel] Don't trivially kill extractvalues (PR49467) All extractvalues of the same value at the same index will map to the same register, so even if one specific extractvalue only has one use, we should not mark it as a trivial kill, as there may be more extractvalues later. Fixes https://bugs.llvm.org/show_bug.cgi?id=49467. Differential Revision: https://reviews.llvm.org/D98145	2021-03-09 18:46:38 +01:00
gbtozers	f0513413c7	[DebugInfo] Add replaceArg function to simplify DBG_VALUE_LIST expressions The LiveDebugValues and LiveDebugVariables implementations for handling DBG_VALUE_LIST instructions can be simplified significantly if they do not have to deal with any duplicated operands, such as a DBG_VALUE_LIST that uses the same register multiple times in its expression. This patch adds a function, replaceArg, that can be used to simplify a DIExpression in the case of duplicated operands. Differential Revision: https://reviews.llvm.org/D83896	2021-03-09 17:41:04 +00:00

... 10 11 12 13 14 ...

45424 Commits