llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Moll	d05ddb86f6	[VP] vp.sitofp cast intrinsic and docs Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119922	2022-03-02 10:16:19 +01:00
serge-sans-paille	a494ae43be	Cleanup includes: TransformsUtils Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120741	2022-03-01 21:00:07 +01:00
Simon Moll	03e83cc8eb	[VP] vp.fptosi cast intrinsic and docs Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119535	2022-02-15 18:17:19 +01:00
Momchil Velikov	6398903ac8	Extend the `uwtable` attribute with unwind table kind We have the `clang -cc1` command-line option `-funwind-tables=1\|2` and the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind tables (1) or asynchronous unwind tables (2)`. However, this is encoded in LLVM IR by the presence or the absence of the `uwtable` attribute, i.e. we lose the information whether to generate want just some unwind tables or asynchronous unwind tables. Asynchronous unwind tables take more space in the runtime image, I'd estimate something like 80-90% more, as the difference is adding roughly the same number of CFI directives as for prologues, only a bit simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even more, if you consider tail duplication of epilogue blocks. Asynchronous unwind tables could also restrict code generation to having only a finite number of frame pointer adjustments (an example of not having a finite number of `SP` adjustments is on AArch64 when untagging the stack (MTE) in some cases the compiler can modify `SP` in a loop). Having the CFI precise up to an instruction generally also means one cannot bundle together CFI instructions once the prologue is done, they need to be interspersed with ordinary instructions, which means extra `DW_CFA_advance_loc` commands, further increasing the unwind tables size. That is to say, async unwind tables impose a non-negligible overhead, yet for the most common use cases (like C++ exceptions), they are not even needed. This patch extends the `uwtable` attribute with an optional value: - `uwtable` (default to `async`) - `uwtable(sync)`, synchronous unwind tables - `uwtable(async)`, asynchronous (instruction precise) unwind tables Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D114543	2022-02-14 14:35:02 +00:00
YASHASVI KHATAVKAR	70fdbf35de	Adding DiBuilder interface for assumed length strings	2022-02-11 14:40:02 -05:00
Paul Robinson	d2495b69f2	[RGT] Exercise both paths through a test BitcastToGEP had an opaque/typed pointer decision point, make sure it exercises both sides. Found by the Rotten Green Tests project.	2022-02-11 10:47:00 -08:00
YASHASVI KHATAVKAR	93d1a623ce	Reverting an entire stack of changes causing build failures	2022-02-10 17:58:22 -05:00
YASHASVI KHATAVKAR	0e7341b7b1	worked on review comments	2022-02-10 15:24:51 -05:00
YASHASVI KHATAVKAR	c26a0d1cda	Updated the test to include proper string get functions	2022-02-10 15:24:50 -05:00
YASHASVI KHATAVKAR	929499eb64	Updated the test to include addtional details	2022-02-10 15:24:50 -05:00
YASHASVI KHATAVKAR	99f990be64	Added StringLocationExp to the new apis	2022-02-10 15:24:50 -05:00
YASHASVI KHATAVKAR	43d421cda3	Adding DIBuilder interface for assumed length string	2022-02-10 15:24:50 -05:00
Craig Topper	60745fb16f	[VP] llvm.vp.fneg intrinsic and LangRef Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119262	2022-02-09 07:54:36 -08:00
Craig Topper	cef177d186	[VP] llvm.vp.fma intrinsic and LangRef Differential Revision: https://reviews.llvm.org/D119185	2022-02-07 15:53:27 -08:00
serge-sans-paille	ffe8720aa0	Reduce dependencies on llvm/BinaryFormat/Dwarf.h This header is very large (3M Lines once expended) and was included in location where dwarf-specific information were not needed. More specifically, this commit suppresses the dependencies on llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used, this has a decent impact on number of preprocessed lines generated during compilation of LLVM, as showcased below. This is achieved by moving some definitions back to the .cpp file, no performance impact implied[0]. As a consequence of that patch, downstream user may need to manually some extra files: llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h In some situations, codes maybe relying on the fact that llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden dependency now needs to be explicit. $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l after: 10978519 before: 11245451 Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup [0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions Differential Revision: https://reviews.llvm.org/D118781	2022-02-04 11:44:03 +01:00
serge-sans-paille	e188aae406	Cleanup header dependencies in LLVMCore Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden ehader dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l before: 6400831 after: 6189948 200k lines less to process is no that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652	2022-02-02 06:54:20 +01:00
Michael Gottesman	7ed95d1577	[debug-info] Add support for llvm.dbg.addr in DIBuilder. I based this off of the API already create for llvm.dbg.value since both intrinsics have the same arguments at the API level. I added some tests exercising the API a little as well as an additional small test that shows how one can use llvm.dbg.addr to limit the PC range where an address value is available in the debugger. This is done by calling llvm.dbg.value with undef and the same metadata info as one used to create the llvm.dbg.addr. rdar://83957028 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D117442	2022-01-18 18:26:50 -08:00
Simon Moll	33efbc8184	[VP] llvm.vp.merge intrinsic and LangRef llvm.vp.merge interprets the %evl operand differently than the other vp intrinsics: all lanes at positions greater or equal than the %evl operand are passed through from the second vector input. Otherwise it behaves like llvm.vp.select. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116725	2022-01-12 14:06:56 +01:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
Nikita Popov	32808cfb24	[IR] Track users of comdats Track all GlobalObjects that reference a given comdat, which allows determining whether a function in a comdat is dead without scanning the whole module. In particular, this makes filterDeadComdatFunctions() have complexity O(#DeadFunctions) rather than O(#SymbolsInModule), which addresses half of the compile-time issue exposed by D115545. Differential Revision: https://reviews.llvm.org/D115864	2022-01-06 09:13:58 +01:00
serge-sans-paille	9290ccc3c1	Introduce the AttributeMask class This class is solely used as a lightweight and clean way to build a set of attributes to be removed from an AttrBuilder. Previously AttrBuilder was used both for building and removing, which introduced odd situation like creation of Attribute with dummy value because the only relevant part was the attribute kind. Differential Revision: https://reviews.llvm.org/D116110	2022-01-04 15:37:46 +01:00
Florian Hahn	5d68dc184e	[Verifier] Iteratively traverse all indirect users. The recursive implementation can run into stack overflows, e.g. like in PR52844. The order the users are visited changes, but for the current use case this only impacts the order error messages are emitted.	2021-12-23 23:20:12 +01:00
Fraser Cormack	8002fa6760	[LangRef] Remove incorrect vector alignment rules The LangRef incorrectly says that if no exact match is found when seeking alignment for a vector type, the largest vector type smaller than the sought-after vector type. This is incorrect as vector types require an exact match, else they fall back to reporting the natural alignment. The corrected rule was not added in its place, as rules for other types (e.g., floating-point types) aren't documented. A unit test was added to demonstrate this. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D112463	2021-12-14 14:35:40 +00:00
Nikita Popov	6213f1dd03	[IR] Make VPIntrinsic::getDeclarationForParams() opaque pointer compatible The vp.load and vp.gather intrinsics require the intrinsic return type to determine the correct function signature. With opaque pointers, it cannot be derived from the parameter pointee types. Differential Revision: https://reviews.llvm.org/D115632	2021-12-14 14:20:59 +01:00
Nikita Popov	9cbab13282	[ConstantsTest] Avoid crash with opaque pointers With opaque pointers there will be no bitcast, so don't assume that.	2021-12-13 15:23:12 +01:00
Arthur Eubanks	93a20ecee4	[DebugInfo] Check DIEnumerator bit width when comparing for equality As mentioned in D106585, this causes non-determinism, which can also be shown by this test case being flaky without this patch. We were using the APSInt's bit width for hashing, but not for checking for equality. APInt::isSameValue() does not check bit width. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D115054	2021-12-03 13:40:22 -08:00
Sanjay Patel	97e921c81f	[PatternMatch] create and use matcher for 'not' that excludes undef elements We needed a stricter version of m_Not for D114462, but I wasn't sure if that was going to be required anywhere else, so I didn't bother to make that reusable. It turns out we have one more existing simplification that needs this (currently miscompiles): https://alive2.llvm.org/ce/z/9-nTKi And there's at least one more fold in that family that we could add. Differential Revision: https://reviews.llvm.org/D114882	2021-12-02 08:51:13 -05:00
Nikita Popov	55d392cc30	[llvm-c] Make LLVMAddAlias opaque pointer compatible Deprecate LLVMAddAlias in favor of LLVMAddAlias2, which accepts a value type and an address space. Previously these were extracted from the pointer type. Differential Revision: https://reviews.llvm.org/D114860	2021-12-02 09:21:16 +01:00
Ellis Hoag	0150645bf5	[DebugInfo] Do not replace existing nodes from DICompileUnit When creating a new DIBuilder with an existing DICompileUnit, load the DINodes from the current DICompileUnit so they don't get overwritten. This is done in the MachineOutliner pass, but it didn't change the CU so the bug never appeared. We need this if we ever want to add DINodes to the CU after it has been created, e.g., DIGlobalVariables. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D114556	2021-11-29 19:46:10 -08:00
Simon Moll	1e65b93f3a	[VP] Canonicalize macros of VPIntrinsics.def Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>. Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro). A follow-up patch has documentation on how the macros are (intended) to be used. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D114144	2021-11-23 16:51:11 +01:00
David Sherwood	ca18fcc2c0	[IR] Change CreateStepVector to work with element types smaller than i8 Currently the stepvector intrinsic only supports element types that are integers of size 8 bits or more. This patch adds support for the creation of stepvectors with smaller element types by creating the intrinsic with i8 elements that we then truncate to the requested size. It's not currently possible to write a vectoriser test to exercise this code path so I have added a unit test here: llvm/unittests/IR/IRBuilderTest.cpp Differential Revision: https://reviews.llvm.org/D113767	2021-11-17 10:47:50 +00:00
Mircea Trofin	a32c2c3808	[NFC] Use Optional<ProfileCount> to model invalid counts ProfileCount could model invalid values, but a user had no indication that the getCount method could return bogus data. Optional<ProfileCount> addresses that, because the user must dereference the optional. In addition, the patch removes concept duplication. Differential Revision: https://reviews.llvm.org/D113839	2021-11-14 19:03:30 -08:00
Nikita Popov	1c5d636af1	[ConstantRangeTest] Add helper to enumerate APInts (NFC) While ForeachNumInConstantRange(ConstantRange::getFull(Bits)) works, it's somewhat roundabout, and I keep looking for this function.	2021-11-12 18:18:51 +01:00
Nikita Popov	2060895c9c	[ConstantRange] Add exact union/intersect (NFC) For some optimizations on comparisons it's necessary that the union/intersect is exact and not a superset. Add methods that return Optional<ConstantRange> only if the result is exact. For the sake of simplicity this is implemented by comparing the subset and superset approximations for now, but it should be possible to do this more directly, as unionWith() and intersectWith() already distinguish the cases where the result is imprecise for the preferred range type functionality.	2021-11-07 21:46:06 +01:00
Nikita Popov	cf71a5ea8f	[ConstantRange] Support zero size in isSizeLargerThan() From an API perspective, it does not make a lot of sense that 0 is not a valid argument to this function. Add the exact check needed to support it.	2021-11-07 21:22:45 +01:00
Luke Benes	2249ecee8d	[IR][ShuffleVector] Fix Wdangling-else warning in InstructionsTest Fix a dangling else that gcc-11 warned about. The EXPECT_EQ macro expands to an if-else, so the whole construction contains a hidden dangling else. Differential Revision: https://reviews.llvm.org/D113346	2021-11-07 00:07:01 +03:00
Nikita Popov	9f0194be45	[ConstantRange] Add getEquivalentICmp() variant with offset (NFCI) Add a variant of getEquivalentICmp() that produces an optional offset. This allows us to create an equivalent icmp for all ranges. Use this in the with.overflow folding code, which was doing this adjustment separately -- this clarifies that the fold will indeed always apply.	2021-11-06 21:59:45 +01:00
Roman Lebedev	a5cd27880a	[IR] Improve member `ShuffleVectorInst::isReplicationMask()` When we have an actual shuffle, we can impose the additional restriction that the mask replicates the elements of the first operand, so we know the replication factor as a ratio of output and op0 vector sizes.	2021-11-06 00:09:27 +03:00
Roman Lebedev	0b36431810	[NFCI] InstructionTest: trim `InstructionsTest.ShuffleMaskIsReplicationMask_*` complexity These tests have pretty high O() complexity due to their nature, which leads to potentially-long runtimes. While in release build for me they took ~1 and ~2 sec, as noted in https://reviews.llvm.org/D113214#inline-1080479 they take minutes in debug build. Fine-tune the amount of permutations they deal with, without affecting the test coverage. After this, they take <~10ms each for me (in release build), hopefully that is good-enough for debug build too.	2021-11-05 19:22:48 +03:00
Roman Lebedev	01d8759ac9	[IR][ShuffleVector] Introduce `isReplicationMask()` matcher Avid readers of this saga may recall from previous installments, that replication mask replicates (lol) each of the `VF` elements in a vector `ReplicationFactor` times. For example, the mask for `ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`. More importantly, replication mask is used by LoopVectorizer when using masked interleaved memory operations. As discussed in previous installments, while it is used by LV, and we seem to support masked interleaved memory operations on X86, it's support in cost model leaves a lot to be desired: until basically yesterday even for AVX512 we had no cost model for it. As it has been witnessed in the recent AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()` costmodel patches, while it is hard-enough to query the cost of a particular assembly sequence [from llvm-mca], afterwards the check lines LV costmodel tests must be updated manually. This is, at the very least, boring. Okay, now we have decent costmodel coverage for interleaving shuffles, but now basically the same mind-killing sequence has to be performed for replication mask. I think we can improve at least the second half of the problem, by teaching the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize `Instruction::ShuffleVector` that are repetition masks, adding exhaustive test coverage using `-cost-model -analyze` + `utils/update_analyze_test_checks.py` This way we can have good exhaustive coverage for cost model, and only basic coverage for the LV costmodel. This patch adds precise undef-aware `isReplicationMask()`, with exhaustive test coverage. * `InstructionsTest.ShuffleMaskIsReplicationMask` shows that it correctly detects all the known masks. * `InstructionsTest.ShuffleMaskIsReplicationMask_undef` shows that replacing some mask elements in a known replication mask still allows us to recognize it as a replication mask. Note, with enough undef elts, we may detect a different tuple. * `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness` shows that if we detected the replication mask with given params, then if we actually generate a true replication mask with said params, it matches element-wise ignoring undef mask elements. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D113214	2021-11-05 16:53:47 +03:00
Jakub Kuderski	3348b841d3	Make enum iteration with seq safe by default By default `llvm::seq` would happily iterate over enums, which may be unsafe if the enum values are not continuous. This patch disable enum iteration with `llvm::seq` and `llvm::seq_inclusive` and adds two new functions: `enum_seq` and `enum_seq_inclusive`. To make sure enum iteration is safe, we require users to declare their enum types as iterable by specializing `enum_iteration_traits<SomeEnum>`. Because it's not always possible to add these traits next to enum definition (e.g., for enums defined in external libraries), we provide an escape hatch to allow iteration on per-callsite basis by passing `force_iteration_on_noniterable_enum`. The main benefit of this approach is that these global declarations via traits can appear just next to enum definitions, making easy to spot when enums are miss-labeled, e.g., after introducing new enum values, whereas `force_iteration_on_noniterable_enum` should stand out and be easy to grep for. This emerged from a discussion with gchatelet@ about reusing llvm's `Sequence.h` in lieu of https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/interface/lgc/EnumIterator.h. Reviewed By: dblaikie, gchatelet, aaron.ballman Differential Revision: https://reviews.llvm.org/D107378	2021-11-03 20:52:21 -04:00
Roman Lebedev	03a4f1f3b8	[ConstantRange] Sign-flipping of signedness-invariant comparisons For certain combination of LHS and RHS constant ranges, the signedness of the relational comparison predicate is irrelevant. This implements complete and precise model for all predicates, as confirmed by the brute-force tests. I'm not sure if there are some more cases that we can handle here. In a follow-up, CVP will make use of this. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D90924	2021-10-31 22:53:17 +03:00
Roman Lebedev	25043c8276	[NFCI] Introduce `ICmpInst::compare()` and use it where appropriate As noted in https://reviews.llvm.org/D90924#inline-1076197 apparently this is a pretty common pattern, let's not repeat it yet again, but have it in a common place. There may be some more places where it could be used, but these are the most obvious ones.	2021-10-30 17:50:06 +03:00
Roman Lebedev	42712698fd	Revert "[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`" Clang OpenMP codegen tests are failing. This reverts commit `288f1f8abe`. This reverts commit `cb90e5356a`.	2021-10-27 22:21:37 +03:00
Roman Lebedev	cb90e5356a	[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x` There's precedent for that in `CreateOr()`/`CreateAnd()`. The motivation here is to avoid bloating the run-time check's IR in `SCEVExpander::generateOverflowCheck()`. Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 21:34:38 +03:00
Yuanfang Chen	7c3fa52785	[DebugInfo] Skip ODRUniquing for mismatched tags Otherwise, ODRUniquing would map some member method/variable MDNodes to have enum type DIScope, resulting in invalid debug info and bad DWARF. - Add a Verifier check that when a 'scope:' operand is an ODR type that is not an enum. - Makes ODRUniquing apply to only ODR types with the same tag so that the debuginfo/DWARF is well-formed. Reviewed By: probinson, aprantl Differential Revision: https://reviews.llvm.org/D111770	2021-10-26 15:28:25 -07:00
Jakub Kuderski	763ae1d2c6	[DomTree][NFC] Clean up nits in DomTree code Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D112482	2021-10-25 16:05:34 -04:00
Itay Bookstein	08ed216000	[IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol As discussed in: * https://reviews.llvm.org/D94166 * https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html The GlobalIndirectSymbol class lost most of its meaning in https://reviews.llvm.org/D109792, which disambiguated getBaseObject (now getAliaseeObject) between GlobalIFunc and everything else. In addition, as long as GlobalIFunc is not a GlobalObject and getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee is a GlobalIFunc cannot currently be modeled properly. Creating aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition, calling getAliaseeObject on a GlobalIFunc will currently return nullptr, which is undesirable because it should return the object itself for non-aliases. This patch refactors the GlobalIFunc class to inherit directly from GlobalObject, and removes GlobalIndirectSymbol (while inlining the relevant parts into GlobalAlias and GlobalIFunc). This allows for calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc itself, making getAliaseeObject() more consistent and enabling alias-to-ifunc to be properly modeled in the IR. I exercised some judgement in the API clients of GlobalIndirectSymbol: some were 'monomorphized' for GlobalAlias and GlobalIFunc, and some remained shared (with the type adapted to become GlobalValue). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D108872	2021-10-20 10:29:47 -07:00
Nikita Popov	274b2439f8	[ConstantRange] Add fast signed multiply The multiply() implementation is very slow -- it performs six multiplications in double the bitwidth, which means that it will typically work on allocated APInts and bypass fast-path implementations. Add an additional implementation that doesn't try to produce anything better than a full range if overflow is possible. At least for the BasicAA use-case, we really don't care about more precise modeling of overflow behavior. The current use of multiply() is fine while the implementation is limited to a single index, but extending it to the multiple-index case makes the compile-time impact untenable.	2021-10-17 16:41:49 +02:00
Nikita Popov	587493b441	[ConstantRange] Compute precise shl range for single elements For the common case where the shift amount is constant (a single element range) we can easily compute a precise range (up to unsigned envelope), so do that.	2021-10-15 23:44:41 +02:00
Nikita Popov	9eb8040a28	[ConstantRange] Support checking optimality for subset of inputs (NFC) We always want to check correctness, but for some operations we can only guarantee optimality for a subset of inputs. Accept an additional predicate that determines whether optimality for a given pair of ranges should be checked.	2021-10-15 22:48:07 +02:00
Nikita Popov	82e858d1bf	[ConstantRange] Better diagnostic for correctness test failure (NFC) Print a friendly error message including the inputs, result and not-contained element if an exhaustive correctness test fails, same as we do if the optimality test fails.	2021-10-15 21:52:17 +02:00
Sanjay Patel	519752062c	[PatternMatch] add matchers for commutative logical and/or We need these to add folds with the same structure as regular commuted logic ops.	2021-10-07 10:37:34 -04:00
Arthur Eubanks	05392466f0	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 13:29:23 -07:00
Arthur Eubanks	569346f274	Revert "Reland [IR] Increase max alignment to 4GB" This reverts commit `8d64314ffe`.	2021-10-06 11:38:11 -07:00
Arthur Eubanks	8d64314ffe	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 11:03:51 -07:00
Arthur Eubanks	72cf8b6044	Revert "[IR] Increase max alignment to 4GB" This reverts commit `df84c1fe78`. Breaks some bots	2021-10-06 10:21:35 -07:00
Arthur Eubanks	df84c1fe78	[IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 09:54:14 -07:00
Kazu Hirata	3081de8c72	[llvm] Migrate from getNumArgOperands to arg_size (NFC) Note that getNumArgOperands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-10-05 08:29:19 -07:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Min-Yih Hsu	475de8da01	[IR]PATCH 2/2: Add MDNode::printTree and dumpTree This patch adds the functionalities to print MDNode in tree shape. For example, instead of printing a MDNode like this: ``` <0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic \| DIFlagFwdDecl, align: 8) ``` The printTree/dumpTree functions can give you: ``` <0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic \| DIFlagFwdDecl, align: 8) <0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0) <0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir") <0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2) <0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype") ``` Which is useful when using it in debugger. Where sometimes printing the whole module to see all MDNodes is too expensive. Differential Revision: https://reviews.llvm.org/D110113	2021-10-02 21:19:52 -07:00
Kazu Hirata	f631173d80	[llvm] Migrate from arg_operands to args (NFC) Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-09-30 08:51:21 -07:00
Simon Moll	72a08c0b94	[VP] Vector predicated vector splice intrinsic This patch introduces the vector-predicated version of the experimental_vector_splice intrinsic [1] at the IR level. It considers the active vector length for both vectors and and uses a vector mask to disable certain lanes in the result. [1] https://reviews.llvm.org/D94708 Change originally authored by Vineet Kumar <vineet.kumar@bsc.es> Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103898	2021-09-29 10:43:36 +02:00
hyeongyu kim	86bf234d0b	[IR] Change the default value of InstertElement to poison (1/4) This patch is for fixing potential insertElement-related bugs like D93818. ``` V = UndefValue::get(VecTy); for(...) V = Builder.CreateInsertElementy(V, Elt, Idx); => V = PoisonValue::get(VecTy); for(...) V = Builder.CreateInsertElementy(V, Elt, Idx); ``` Like above, this patch changes the placeholder V to poison. The patch will be separated into several commits. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D110311	2021-09-28 22:29:16 +09:00
Anirudh Prasad	e09a1dc475	[SystemZ][z/OS] Add GOFF Support to the DataLayout - This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation. - Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend. Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay Differential Revision: https://reviews.llvm.org/D109362	2021-09-24 14:09:01 -04:00
Alok Kumar Sharma	a5b72abc9e	[DebugInfo] Enhance DIImportedEntity to accept children entities New field `elements` is added to '!DIImportedEntity', representing list of aliased entities. This is needed to dump optimized debugging information where all names in a module are imported, but a few names are imported with overriding aliases. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D109343	2021-09-16 10:41:55 +05:30
Nikita Popov	90ec6dff86	[OpaquePtr] Forbid mixing typed and opaque pointers Currently, opaque pointers are supported in two forms: The -force-opaque-pointers mode, where all pointers are opaque and typed pointers do not exist. And as a simple ptr type that can coexist with typed pointers. This patch removes support for the mixed mode. You either get typed pointers, or you get opaque pointers, but not both. In the (current) default mode, using ptr is forbidden. In -opaque-pointers mode, all pointers are opaque. The motivation here is that the mixed mode introduces additional issues that don't exist in fully opaque mode. D105155 is an example of a design problem. Looking at D109259, it would probably need additional work to support mixed mode (e.g. to generate GEPs for typed base but opaque result). Mixed mode will also end up inserting many casts between i8* and ptr, which would require significant additional work to consistently avoid. I don't think the mixed mode is particularly valuable, as it doesn't align with our end goal. The only thing I've found it to be moderately useful for is adding some opaque pointer tests in between typed pointer tests, but I think we can live without that. Differential Revision: https://reviews.llvm.org/D109290	2021-09-10 15:18:23 +02:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Simon Moll	ea2cdbf5e6	[VP] Declaration and docs for vp.select intrinsic llvm.vp.select extends the regular select instruction with an explicit vector length (%evl). All lanes with indexes at and above %evl are undefined. Lanes below %evl are taken from the first input where the mask is true and from the second input otherwise. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D105351	2021-09-02 11:17:14 +02:00
Yonghong Song	1bebc31c61	[DebugInfo] generate btf_tag annotations for func parameters Generate btf_tag annotations for function parameters. A field "annotations" is introduced to DILocalVariable, and annotations are represented as an DINodeArray, similar to DIComposite elements. The following example illustrates how annotations are encoded in IR: distinct !DILocalVariable(name: "info",, arg: 1, ..., annotations: !10) !10 = !{!11, !12} !11 = !{!"btf_tag", !"a"} !12 = !{!"btf_tag", !"b"} Differential Revision: https://reviews.llvm.org/D106620	2021-08-26 14:18:30 -07:00
Yonghong Song	30c288489a	[DebugInfo] generate btf_tag annotations for DIGlobalVariable Generate btf_tag annotations for DIGlobalVariable. A field "annotations" is introduced to DIGlobalVariable, and annotations are represented as an DINodeArray, similar to DIComposite elements. The following example illustrates how annotations are encoded in IR: distinct !DIGlobalVariable(..., annotations: !10) !10 = !{!11, !12} !11 = !{!"btf_tag", !"a"} !12 = !{!"btf_tag", !"b"} Differential Revision: https://reviews.llvm.org/D106619	2021-08-26 10:03:44 -07:00
Roman Lebedev	564d85e090	The maximal representable alignment in LLVM IR is 1GiB, not 512MiB In LLVM IR, `AlignmentBitfieldElementT` is 5-bit wide But that means that the maximal alignment exponent is `(1<<5)-2`, which is `30`, not `29`. And indeed, alignment of `1073741824` roundtrips IR serialization-deserialization. While this doesn't seem all that important, this doubles the maximal supported alignment from 512MiB to 1GiB, and there's actually one noticeable use-case for that; On X86, the huge pages can have sizes of 2MiB and 1GiB (!). So while this doesn't add support for truly huge alignments, which i think we can easily-ish do if wanted, i think this adds zero-cost support for a not-trivially-dismissable case. I don't believe we need any upgrade infrastructure, and since we don't explicitly record the IR version, we don't need to bump one either. As @craig.topper speculates in D108661#2963519, this might be an artificial limit imposed by the original implementation of the `getAlignment()` functions. Differential Revision: https://reviews.llvm.org/D108661	2021-08-26 12:53:39 +03:00
Yonghong Song	0b32dca12e	Reland [DebugInfo] generate btf_tag annotations for DIComposite types Clang patch D106614 added attribute btf_tag support. This patch generates btf_tag annotations for DIComposite types. A field "annotations" is introduced to DIComposite, and the annotations are represented as an DINodeArray, similar to DIComposite elements. The following example illustrates how annotations are encoded in IR: distinct !DICompositeType(..., annotations: !10) !10 = !{!11, !12} !11 = !{!"btf_tag", !"a"} !12 = !{!"btf_tag", !"b"} Each btf_tag annotation is represented as a 2D array of meta strings. Each record may have more than one btf_tag annotations, as in the above example. Reland with additional fixes for llvm/unittests/IR/DebugTypeODRUniquingTest.cpp. Differential Revision: https://reviews.llvm.org/D106615	2021-08-19 17:33:50 -07:00
Arthur Eubanks	7c8206cd2a	[NFC] Cleanup AttributeList::getStackAlignment() So that we don't use a confusing index.	2021-08-19 14:21:40 -07:00
Arthur Eubanks	3f4d00bc3b	[NFC] More get/removeAttribute() cleanup	2021-08-17 21:05:41 -07:00
Arthur Eubanks	de0ae9e89e	[NFC] Cleanup more AttributeList::addAttribute()	2021-08-17 21:05:41 -07:00
Arthur Eubanks	ad727ab7d9	[NFC] Migrate some callers away from Function/AttributeLists methods that take an index These methods can be confusing.	2021-08-17 21:05:40 -07:00
Fraser Cormack	f3e9047249	[VP] Add vector-predicated reduction intrinsics This patch adds vector-predicated ("VP") reduction intrinsics corresponding to each of the existing unpredicated `llvm.vector.reduce.*` versions. Unlike the unpredicated reductions, all VP reductions have a start value. This start value is returned when the no vector element is active. Support for expansion on targets without native vector-predication support is included. This patch is based on the ["reduction slice"](https://reviews.llvm.org/D57504#1732277) of the LLVM-VP reference patch (https://reviews.llvm.org/D57504). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104308	2021-08-17 17:56:35 +01:00
Arthur Eubanks	d7593ebaee	[NFC] Clean up users of AttributeList::hasAttribute() AttributeList::hasAttribute() is confusing, use clearer methods like hasParamAttr()/hasRetAttr(). Add hasRetAttr() since it was missing from AttributeList.	2021-08-13 11:59:18 -07:00
Arthur Eubanks	a9831cce1e	[NFC] Remove public uses of AttributeList::getAttributes() Use methods that better convey the intent.	2021-08-13 11:38:12 -07:00
Arthur Eubanks	80ea2bb574	[NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes() This is more consistent with similar methods.	2021-08-13 11:16:52 -07:00
Arthur Eubanks	92ce6db9ee	[NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr() This is more consistent with similar methods.	2021-08-13 11:09:18 -07:00
Arthur Eubanks	a0c42ca56c	[NFC] Remove AttributeList::hasParamAttribute() It's the same as AttributeList::hasParamAttr().	2021-08-13 10:58:21 -07:00
Paul Robinson	75aa3d520d	Add a DIExpression const-folder to prevent silly expressions. It's entirely possible (because it actually happened) for a bool variable to end up with a 256-bit DW_AT_const_value. This came about when a local bool variable was initialized from a bitfield in a 32-byte struct of bitfields, and after inlining and constant propagation, the variable did have a constant value. The sequence of optimizations had it carrying "i256" values around, but once the constant made it into the llvm.dbg.value, no further IR changes could affect it. Technically the llvm.dbg.value did have a DIExpression to reduce it back down to 8 bits, but the compiler is in no way ready to emit an oversized constant and a DWARF expression to manipulate it. Depending on the circumstances, we had either just the very fat bool value, or an expression with no starting value. The sequence of optimizations that led to this state did seem pretty reasonable, so the solution I came up with was to invent a DWARF constant expression folder. Currently it only does convert ops, but there's no reason it couldn't do other ops if that became useful. This broke three tests that depended on having convert ops survive into the DWARF, so I added an operator that would abort the folder to each of those tests. Differential Revision: https://reviews.llvm.org/D106915	2021-08-05 06:14:40 -07:00
Dylan Fleming	80e0bd1496	[SVE][IR] Fix Binary op matching in PatternMatch::m_VScale Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D105978	2021-07-23 11:39:13 +01:00
Guillaume Chatelet	d6da02d952	[llvm] Add enum iteration to Sequence This patch allows iterating typed enum via the ADT/Sequence utility. It also changes the original design to better separate concerns: - `StrongInt` only deals with safe `intmax_t` operations, - `SafeIntIterator` presents the iterator and reverse iterator interface but only deals with safe `StrongInt` internally. - `iota_range` only deals with `SafeIntIterator` internally. This design ensures that operations are always valid. In particular, "Out of bounds" assertions fire when: - the `value_type` is not representable as an `intmax_t` - iterator operations make internal computation underflow/overflow - the internal representation cannot be converted back to `value_type` Differential Revision: https://reviews.llvm.org/D106279	2021-07-21 12:48:53 +00:00
Fraser Cormack	5cbd5c62be	[VP][NFC] Correct formatting in unit test	2021-07-15 12:38:47 +01:00
Guillaume Chatelet	2c47b8847e	Revert "[llvm] Add enum iteration to Sequence" This reverts commit `a006af5d6e`.	2021-07-13 16:44:42 +00:00
Guillaume Chatelet	a006af5d6e	[llvm] Add enum iteration to Sequence This patch allows iterating typed enum via the ADT/Sequence utility. Differential Revision: https://reviews.llvm.org/D103900	2021-07-13 16:22:19 +00:00
Nikita Popov	7ed3e87825	[Attributes] Determine attribute properties from TableGen data Continuing from D105763, this allows placing certain properties about attributes in the TableGen definition. In particular, we store whether an attribute applies to fn/param/ret (or a combination thereof). This information is used by the Verifier, as well as the ForceFunctionAttrs pass. I also plan to use this in LLParser, which also duplicates info on which attributes are valid where. This keeps metadata about attributes in one place, and makes it more likely that it stays in sync, rather than in various functions spread across the codebase. Differential Revision: https://reviews.llvm.org/D105780	2021-07-12 22:13:38 +02:00
Arthur Eubanks	d338e79a4c	[OpaquePtr] Remove checking pointee type for byval/preallocated type These currently always require a type parameter. The bitcode reader already upgrades old bitcode without the type parameter to use the pointee type. In cases where the caller does not have byval but the callee does, we need to follow CallBase::paramHasAttr() and also look at the callee for the byval type so that CallBase::isByValArgument() and CallBase::getParamByValType() are in sync. Do the same for preallocated. While we're here add a corresponding version for inalloca since we'll need it soon. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104663	2021-07-07 14:28:55 -07:00
Hussain Kadhem	d21a35ac0a	[VP] Implementation of intrinsic and SDNode definitions for VP load, store, gather, scatter. This patch adds intrinsic definitions and SDNodes for predicated load/store/gather/scatter, based on the work done in D57504. Reviewed By: simoll, craig.topper Differential Revision: https://reviews.llvm.org/D99355	2021-07-01 13:34:44 +02:00
Fraser Cormack	983972bfb0	[VP][NFCI] Address various clang-tidy warnings Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D104288	2021-06-21 10:57:42 +01:00
Simon Moll	74d45b884c	[VP] Binary floating-point intrinsics. This patch implements vector-predicated intrinsics on IR level for fadd, fsub, fmul, fdiv and frem. There operate in the default floating-point environment. We will use constrained fp operand bundles for constrained vector-predicated fp math (D93455). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D93470	2021-06-14 08:51:41 +02:00
Simon Moll	0f9d299122	[VP] getDeclarationForParams `VPIntrinsic::getDeclarationForParams` creates a vp intrinsic declaration for parameters you want to call it with. This is in preparation of a new builder class that makes emitting vp intrinsic code nearly as convenient as using a plain ir builder (aka `VectorBuilder`, to be used by D99750). Reviewed By: frasercrmck, craig.topper, vkmr Differential Revision: https://reviews.llvm.org/D102686	2021-06-08 14:21:28 +02:00
Arthur Eubanks	8961293851	[OpaquePtr] Create API to make a copy of a PointerType with some address space Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429	2021-06-01 16:52:32 -07:00
Craig Topper	2830d924b0	[VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names. Parameter positions seem like they should be unsigned. While there, make function names lowercase per coding standards. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103224	2021-05-28 11:28:47 -07:00
Simon Moll	66963bf381	[VP] make getFunctionalOpcode return an Optional The operation of some VP intrinsics do/will not map to regular instruction opcodes. Returning 'None' seems more intuitive here than 'Instruction::Call'. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D102778	2021-05-19 17:08:34 +02:00
Benjamin Kramer	05de4b4139	Put back the trailing commas on TYPED_TEST_SUITE This avoids a -pedantic warning: warning: ISO C++11 requires at least one argument for the "..." in a variadic macro See also https://github.com/google/googletest/issues/2271	2021-05-17 14:14:13 +02:00
Benjamin Kramer	34fa3f8733	Clean up uses of gmock Invoke in an attempt to make it work with GCC 6.2. NFCI.	2021-05-17 13:48:45 +02:00

1 2 3 4 5 ...

1031 Commits