Commit Graph

139349 Commits

Author SHA1 Message Date
Eric Astor b901b6ab17 Revert "[ms] [llvm-ml] Add support for .radix directive, and accept all radix specifiers"
This reverts commit 5dd1b6d612.
2020-09-23 13:59:34 -04:00
Craig Topper 7a3c643c35 [SLP] Make HorizontalReduction::getOperationData take an Instruction* instead of a Value*. NFCI
All of the callers already have an Instruction *; many of them
obtained it from a dyn_cast.

Also update the OperationData constructor to use an Instruction&
to remove a dyn_cast and make it clear that the pointer is non-null.

Differential Revision: https://reviews.llvm.org/D88132
2020-09-23 10:51:03 -07:00
Craig Topper f21f835ee8 [X86] Improve demanded bits for X86ISD::BEXTR.
If the control is constant we can figure out exactly which bits
of the input are demanded.
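
To make the computation concrete, here is a standalone sketch assuming plain 64-bit arithmetic (the helper name is hypothetical; the in-tree code works on APInt):

```
#include <algorithm>
#include <cstdint>

// Hypothetical sketch: BEXTR extracts Len bits of the source starting at
// bit Start, with Start = Control[7:0] and Len = Control[15:8], so exactly
// those source bits are demanded. Assumes BitWidth <= 64.
static uint64_t demandedSrcBitsForBEXTR(uint64_t Control, unsigned BitWidth) {
  unsigned Start = Control & 0xff;
  unsigned Len = (Control >> 8) & 0xff;
  if (Start >= BitWidth || Len == 0)
    return 0; // The result is all zeros; no source bit matters.
  Len = std::min(Len, BitWidth - Start); // Clip the field to the value width.
  uint64_t Mask = Len >= 64 ? ~0ULL : ((1ULL << Len) - 1);
  return Mask << Start;
}
```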

Differential Revision: https://reviews.llvm.org/D88072
2020-09-23 10:51:02 -07:00
Eric Astor aca7105db9 Fix include location (accidentally committed a local variation) 2020-09-23 13:50:25 -04:00
Sanjay Patel 6189a8d9f5 [TTI] add wrapper for matching vector reduction to reduce code duplication; NFC
I'm not sure what this means, but the order in which we try
the matches makes a difference on at least 1 regression test...
2020-09-23 13:48:57 -04:00
Eric Astor 5dd1b6d612 [ms] [llvm-ml] Add support for .radix directive, and accept all radix specifiers
Add support for .radix directive, and radix specifiers [yY] (binary), [oOqQ] (octal), and [tT] (decimal).

Also, when lexing MASM integers, require a radix specifier; MASM requires that all literals without a radix specifier be treated as written in the default radix. (e.g., 0100 = 100)
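
A rough sketch of the lexing rule described above (a hypothetical standalone helper, not the actual llvm-ml lexer; the [hH] hex suffix is standard MASM but not part of this commit's description, so treat it as an assumption):

```
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical helper: interpret a MASM integer literal using the radix
// suffixes described above, falling back to the current default radix when
// no suffix is present -- so with the default radix of 10, "0100" is 100.
static bool parseMasmInteger(const std::string &Lit, unsigned DefaultRadix,
                             uint64_t &Val) {
  if (Lit.empty())
    return false;
  unsigned Radix = DefaultRadix;
  size_t Len = Lit.size();
  switch (std::tolower(static_cast<unsigned char>(Lit.back()))) {
  case 'y':           Radix = 2;  --Len; break; // binary
  case 'o': case 'q': Radix = 8;  --Len; break; // octal
  case 't':           Radix = 10; --Len; break; // decimal
  case 'h':           Radix = 16; --Len; break; // hex (assumed, standard MASM)
  default: break;                               // no suffix: default radix
  }
  if (Len == 0 || Radix < 2 || Radix > 16)
    return false;
  Val = 0;
  for (size_t I = 0; I != Len; ++I) {
    char C = static_cast<char>(std::tolower(static_cast<unsigned char>(Lit[I])));
    unsigned Digit;
    if (C >= '0' && C <= '9')
      Digit = C - '0';
    else if (C >= 'a' && C <= 'f')
      Digit = C - 'a' + 10;
    else
      return false;
    if (Digit >= Radix) // digit must be valid in the chosen radix
      return false;
    Val = Val * Radix + Digit;
  }
  return true;
}
```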

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D87400
2020-09-23 13:45:58 -04:00
Paul C. Anagnostopoulos b3931188fd Enhance TableGen so that backends can produce better error messages.
Modify SearchableTableEmitter.cpp to take advantage.
Clean up formatting and capitalization issues.
2020-09-23 13:35:32 -04:00
Krzysztof Parzyszek 76e8c1899e Break long line accidentally left in the previous commit 2020-09-23 12:24:45 -05:00
Fangrui Song 01b9deba76 Revert D87970 "[ThinLTO] Avoid temporaries when loading global decl attachment metadata"
This reverts commit ab1b4810b5.

It caused an issue in llvm::lto::thinBackend for a -fsanitize=cfi build.

```
AbbrevNo is 0 => "Invalid abbrev number"
#0  llvm::BitstreamCursor::getAbbrev (this=0x9db4c8, AbbrevID=4) at llvm/include/llvm/Bitstream/BitstreamReader.h:528
#1  0x00007f5f777a6eb4 in llvm::BitstreamCursor::readRecord (this=0x9db4c8, AbbrevID=4, Vals=llvm::SmallVector of Size 0, Capacity 64, Blob=0x7ffcd0e26558) at /usr/local/google/home/maskray/llvm/llvm/lib/Bitstream/Reader/BitstreamReader.cpp:228
#2  0x00007f5f796bf633 in llvm::MetadataLoader::MetadataLoaderImpl::lazyLoadOneMetadata (this=0x9db3a0, ID=188, Placeholders=...) at /usr/local/google/home/maskray/llvm/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1091
#3  0x00007f5f796c2527 in llvm::MetadataLoader::MetadataLoaderImpl::getMetadataFwdRefOrLoad (this=0x9db3a0, ID=188) at llvm/lib/Bitcode/Reader/MetadataLoader.cpp:668
#4  0x00007f5f796bfff3 in llvm::MetadataLoader::getMetadataFwdRefOrLoad (this=0xd31580, Idx=188) at llvm/lib/Bitcode/Reader/MetadataLoader.cpp:2290
#5  0x00007f5f79638265 in (anonymous namespace)::BitcodeReader::parseFunctionBody (this=0xd312e0, F=0x9de758) at llvm/lib/Bitcode/Reader/BitcodeReader.cpp:3938
#6  0x00007f5f79635d32 in (anonymous namespace)::BitcodeReader::materialize (this=0xd312e0, GV=0x9de758) at llvm/lib/Bitcode/Reader/BitcodeReader.cpp:5408
#7  0x00007f5f7f8dbe3e in llvm::Module::materialize (this=0x9b92c0, GV=0x9de758) at llvm/lib/IR/Module.cpp:442
#8  0x00007f5f7f7f8fbe in llvm::GlobalValue::materialize (this=0x9de758) at llvm/lib/IR/Globals.cpp:50
#9  0x00007f5f83b9b5f5 in llvm::FunctionImporter::importFunctions (this=0x7ffcd0e2a730, DestModule=..., ImportList=...) at llvm/lib/Transforms/IPO/FunctionImport.cpp:1182
```
2020-09-23 10:24:08 -07:00
Krzysztof Parzyszek e976fb1e54 [EarlyCSE] Fix crash with expensive checks after D87691
D87691 reordered some checks, which turned out to be unsafe. More
specifically, when examining a store instruction, the check against
getOrCreateResult should be done before attempting to call
isSameMemGeneration. Otherwise a crash in the MSSA walker can occur.

This patch restores the order of these calls to what it was originally.
2020-09-23 12:21:34 -05:00
Vinicius Tinti 577adda54f [Support/Path] Add path::is_absolute_gnu
Implements IS_ABSOLUTE_PATH from GNU tools.

The C++17 is_absolute behavior differs from the behavior defined by GNU
tools.

According to cppreference.com, C++17 states: "An absolute path is a path
that unambiguously identifies the location of a file without reference
to an additional starting location."

In other words, the rules are:
 1. POSIX style paths with nonempty root directory are absolute.
 2. Windows style paths with nonempty root name and root directory are
    absolute.
 3. No other paths are absolute.

GNU rules are:
 1. Paths starting with a path separator are absolute.
 2. Windows style paths are also absolute if they start with a character
    followed by ':'.
 3. No other paths are absolute.

In the Windows style, the path "C:\Users\Default" has "C:" as its root name
and "\" as its root directory.

Hence "C:" on Windows is absolute under the GNU rules but not absolute under
C++17, because it has no root directory. Likewise "/" and "\" on Windows are
absolute under GNU but not under C++17, due to an empty root name.
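
A minimal sketch of the GNU rules (assuming a plain std::string input; the real llvm::sys::path::is_absolute_gnu operates on a Twine and a path Style parameter instead):

```
#include <string>

// Sketch of the GNU IS_ABSOLUTE_PATH rules described above.
static bool isAbsoluteGnu(const std::string &Path, bool WindowsStyle) {
  if (Path.empty())
    return false;
  // Rule 1: paths starting with a path separator are absolute.
  if (Path[0] == '/' || (WindowsStyle && Path[0] == '\\'))
    return true;
  // Rule 2: Windows style paths are also absolute if they start with a
  // character followed by ':' -- so "C:" is absolute even though it has
  // no root directory.
  if (WindowsStyle && Path.size() >= 2 && Path[1] == ':')
    return true;
  return false; // Rule 3: no other paths are absolute.
}
```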

Related to PR46368.

Differential Revision: https://reviews.llvm.org/D87667
2020-09-23 18:01:32 +01:00
Guozhi Wei fd75ad8662 [MBFIWrapper] Add a new function getBlockProfileCount
MBFIWrapper keeps track of the block frequencies of newly created and
modified blocks, and those modified frequencies should also be reflected in
block profile counts. The class didn't provide a getBlockProfileCount
interface, so users could only query profile counts through the underlying
MBFI; since the underlying MBFI doesn't know about the modifications made in
MBFIWrapper, it either returned stale profile counts for modified blocks or
simply crashed on new blocks.

This patch adds a getBlockProfileCount function to MBFIWrapper to handle
new or modified blocks.
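
A condensed sketch of the new interface (illustrative: the MergedBBFreq map name and the exact logic are assumptions based on the description above, not the verbatim patch):

```
// Frequencies tracked by the wrapper override the underlying MBFI, so
// profile counts for new or modified blocks are derived from the wrapper's
// own data; untouched blocks still defer to the underlying MBFI.
Optional<uint64_t>
MBFIWrapper::getBlockProfileCount(const MachineBasicBlock &MBB) const {
  auto I = MergedBBFreq.find(&MBB);
  if (I != MergedBBFreq.end())
    return MBFI.getProfileCountFromFreq(I->second.getFrequency());
  return MBFI.getBlockProfileCount(&MBB);
}
```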

Differential Revision: https://reviews.llvm.org/D87802
2020-09-23 09:31:45 -07:00
Andrew Wei c2deacd929 [AArch64] Fix ldst optimization of non-immediate store offset
When matching a store instruction for ldst opt, we should make sure the store is in 'reg+imm' form like the load;
otherwise isLdOffsetInRangeOfSt will hit an assertion, since it uses getImm() directly.
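
The fix reduces to a guard of roughly this shape (hypothetical condensed form; getLdStOffsetOp is assumed to be the optimizer's accessor for the offset operand):

```
// Only treat the store as a pairing candidate when its offset operand
// really is an immediate, since isLdOffsetInRangeOfSt calls getImm() on it
// unconditionally.
if (!getLdStOffsetOp(StoreMI).isImm())
  return false;
```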

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87905
2020-09-23 23:00:13 +08:00
Simon Pilgrim 91589cf679 Add missing namespace closure comments. NFCI.
Fixes some clang-tidy llvm-namespace-comment warnings.
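
For reference, the shape the llvm-namespace-comment check expects (illustrative snippet; the namespace names here are just examples):

```
namespace llvm {
namespace object {
// ... declarations ...
} // end namespace object
} // end namespace llvm
```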
2020-09-23 16:19:25 +01:00
Simon Pilgrim 474dc33d07 Add missing namespace closure comment. NFCI.
Fixes clang-tidy llvm-namespace-comment warning.
2020-09-23 16:19:25 +01:00
Sebastian Neubauer a343b9b032 Revert "[AMDGPU] Insert waitcnt after returning from call"
This reverts commit ca907bfb57.

According to michel.daenzer,
> This completely broke the Mesa radeonsi driver on Navi 14. Xorg +
> xterm come up with major corruption & psychedelic colours.
2020-09-23 17:16:39 +02:00
Cameron McInally db40a74344 [SVE] Lower fixed length ISD::VECREDUCE_ADD to Scalable
Differential Revision: https://reviews.llvm.org/D87796
2020-09-23 09:08:07 -05:00
Florian Hahn 31923f6b36 [VPlan] Disconnect VPValue and VPUser.
This refactors VPUser to not inherit from VPValue, to facilitate
introducing operations that produce multiple VPValues (e.g.
VPInterleaveRecipe).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D84679
2020-09-23 14:44:31 +01:00
Jonas Paulsson 370a8c8025 [SystemZ] Make sure not to call getZExtValue on a >64 bit constant.
Better use isZero() and isIntN() in SystemZTargetTransformInfo rather than
calling getZExtValue() since the immediate operand may be wider than 64 bits,
which is not allowed with getZExtValue().
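
The safe pattern looks roughly like this (hypothetical helper; the point is that getZExtValue() is only reached once isIntN() has confirmed the value fits):

```
#include "llvm/IR/Constants.h"
#include "llvm/Support/MathExtras.h"
using namespace llvm;

// getZExtValue() asserts on values wider than 64 bits, so gate it behind
// isZero()/isIntN(), both of which are valid at any bit width.
static bool isCheapImmediate(const ConstantInt *CI) {
  if (CI->isZero())
    return true;
  // Only call getZExtValue() once the value is known to fit in 64 bits.
  return CI->getValue().isIntN(64) && isUInt<32>(CI->getZExtValue());
}
```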

Fixes https://bugs.llvm.org/show_bug.cgi?id=47600

Review: Simon Pilgrim
2020-09-23 15:36:32 +02:00
Matt Arsenault c463fd136e GlobalISel: Fix truncating shift amount in trunc (shl) combine
The shift amount type does not necessarily match the result type. This
was inserting a trunc from s32 to s32, which asserted. Just preserve
the original shift amount type, which can be legalized later.
2020-09-23 09:07:50 -04:00
Matt Arsenault af0207f2ba AMDGPU: Check global FP atomics match default FP mode
We would always select global FP atomics from atomicrmw fadd, although
they have a hardcoded FP mode.
2020-09-23 09:07:50 -04:00
Kerry McLaughlin d0149ba9b4 [SVE][CodeGen] Lower legal integer -> floating point conversions
This patch adds new ISD nodes, SCVTZ_MERGE_PASSTHRU &
UCVTZ_MERGE_PASSTHRU, which are used to lower both legal
scalable vector [S|U]INT_TO_FP operations and the following intrinsics:
 - llvm.aarch64.sve.scvtf
 - llvm.aarch64.sve.ucvtf

Reviewed By: sdesmalen, efriedma

Differential Revision: https://reviews.llvm.org/D87913
2020-09-23 11:53:53 +01:00
Sebastian Neubauer ca907bfb57 [AMDGPU] Insert waitcnt after returning from call
When memory operations are outstanding on function calls, either the
caller or the callee can insert a waitcnt to ensure that all reads are
finished.
Calls need some time to be executed, so if the callee inserts the
waitcnt, filling the instruction buffer and waiting for memory will be
interleaved, hiding some latency. This comes at the cost of having a
waitcnt inside functions that may not be needed as no memory operations
are outstanding.

For function calls, this is already implemented. The same principle
applies to returns: If the caller inserts a waitcnt after the call, the
callee does not have to wait and the return and memory operation can be
run in parallel.

This commit implements waiting in the caller after returning from a
function call.

Differential Revision: https://reviews.llvm.org/D87674
2020-09-23 12:17:59 +02:00
David Sherwood e077367a28 [SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits
An existing function Type::getScalarSizeInBits returns a uint64_t
instead of a TypeSize class because the caller is requesting a
scalar size, which cannot be scalable. This patch makes other
similar functions requesting a scalar size consistent with that,
thereby eliminating more than 1000 implicit TypeSize -> uint64_t
casts.

Differential revision: https://reviews.llvm.org/D87889
2020-09-23 09:20:08 +01:00
David Sherwood 59c4d5aad0 [SVE] Fix InstCombinerImpl::PromoteCastOfAllocation for scalable vectors
In this patch I've fixed some warnings that arose from the implicit
cast of TypeSize -> uint64_t. I tried writing a variety of different
cases to show how this optimisation might work for scalable vectors
and found:

1. The optimisation does not work for cases where the cast type
is scalable and the allocated type is not. This is because we need to
know how many times the cast type fits into the allocated type.
2. If we pass all the various checks for the case when the allocated
type is scalable and the cast type is not, then when creating the
new alloca we have to take vscale into account. This leads to
sub-optimal IR that is worse than the original IR.
3. For the remaining case when both the alloca and cast types are
scalable it is hard to find examples where the optimisation would
kick in, except for simple bitcasts, because we typically fail the
ABI alignment checks.

For now I've changed the code to bail out if only one of the alloca
and cast types is scalable. This means we continue to support the
existing cases where both types are fixed, and also the specific case
when both types are scalable with the same size and alignment, for
example a simple bitcast of an alloca to another type.
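
In condensed form, the new bail-out amounts to something like this (a sketch with hypothetical names, not the verbatim patch):

```
#include "llvm/IR/DerivedTypes.h"
using namespace llvm;

// Promote the alloca only when the allocated type and the cast-to type
// agree on scalability, so "how many times does the cast type fit" stays
// a plain integer ratio.
static bool shouldBailOnScalability(Type *AllocElTy, Type *CastElTy) {
  bool AllocIsScalable = isa<ScalableVectorType>(AllocElTy);
  bool CastIsScalable = isa<ScalableVectorType>(CastElTy);
  return AllocIsScalable != CastIsScalable; // mixed fixed/scalable: bail out
}
```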

I've added tests that show we don't attempt to promote the alloca,
except for simple bitcasts:

  Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll

Differential revision: https://reviews.llvm.org/D87378
2020-09-23 08:43:05 +01:00
Piotr Sobczak 8d7fd73c3a [AMDGPU] Fix merging m0 inits
Fix incorrect merges of m0 inits in loops.

It was assumed that if a clobbering instruction appears in
the same block as an init, and the clobbering instruction
does not dominate the init, then it does not interfere with
the init.

This does not work in the presence of loops, where in this
scenario, the clobbering instruction does interfere with
the init in another iteration.

To fix this, do not check for block equality and defer the
decision to the predecessor check.

Differential Revision: https://reviews.llvm.org/D87882
2020-09-23 09:13:43 +02:00
Albion Fung d7eb917a7c [PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins
This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10.

Differential: https://reviews.llvm.org/D87394#inline-815858
2020-09-23 01:18:14 -05:00
Martin Storsjö b90132399a [CVP] Remove a redundant trailing semicolon, fixing GCC warnings. NFC. 2020-09-23 09:03:01 +03:00
Martin Storsjö 2c4c659666 [InstCombine] Add parentheses in assert to silence GCC warning. NFC. 2020-09-23 09:03:01 +03:00
Martin Storsjö f69e090d7d [MC] [Win64EH] Try to generate packed unwind info where possible
In practice, this only gives modest savings (for a 6.5 MB DLL with
230 KB xdata, the xdata sections shrinks by around 2.5 KB); to
gain more, the frame lowering would need to be tweaked to more often
generate frame layouts that match the canonical layouts that can
be written in packed form.

Differential Revision: https://reviews.llvm.org/D87371
2020-09-23 09:03:01 +03:00
Teresa Johnson ab1b4810b5 [ThinLTO] Avoid temporaries when loading global decl attachment metadata
When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.

Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.

This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.

For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to the nearest second:

No -fwhole-program-vtables:
  Lazy loading on (head):  1s
  Lazy loading off (head): 3s
  Lazy loading on (patch): 1s

With -fwhole-program-vtables:
  Lazy loading on (head):  440s
  Lazy loading off (head): 4s
  Lazy loading on (patch): 2s

Differential Revision: https://reviews.llvm.org/D87970
2020-09-22 20:32:07 -07:00
Bing1 Yu ec24e50553 [CostModel][X86] add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)
add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D87884
2020-09-23 10:29:10 +08:00
Andrew Litteken 88bc59c300 Revert "[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData."
This reverts commit 4944bb190f.
2020-09-22 21:02:34 -05:00
Fangrui Song bee68b2956 [EHStreamer] Ensure CallSiteEntry::{BeginLabel,EndLabel} are non-null. NFC
... to simplify the code a bit.

Reviewed By: rahmanl

Differential Revision: https://reviews.llvm.org/D87999
2020-09-22 17:34:43 -07:00
Andrew Litteken 4944bb190f [IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData.
The IRSimilarityCandidate is a container to hold a region of
IRInstructions and offer interfaces for the starting instruction, ending
instruction, parent function, and length.  It also assigns a global value
number for each unique instance of a value in the region.

It also contains an interface to compare two IRSimilarityCandidates as to
whether they have the same sequence of similar instructions.

Tests for whether the instructions are similar are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.

Differential Revision: https://reviews.llvm.org/D86970
2020-09-22 18:42:31 -05:00
Hubert Tong 32c9991dab [InstCombine] Fix errno bug in pow expansion to sqrt
A conversion from `pow` to `sqrt` shall not call an `errno`-setting
`sqrt` with -infinity: the `sqrt` will set `EDOM` where the `pow`
call need not.
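
The behavior difference is easy to observe directly; a small demo (errno reporting depends on the platform's math_errhandling, so treat the errno results as typical glibc behavior, not guaranteed):

```
#include <cerrno>
#include <cmath>
#include <cstdio>

// pow(-inf, 0.5) is well defined: +infinity, no domain error. sqrt(-inf)
// is a domain error, which an errno-setting sqrt reports as EDOM.
int main() {
  errno = 0;
  double P = std::pow(-INFINITY, 0.5);
  std::printf("pow : %f errno=%d\n", P, errno); // +inf, errno stays 0
  errno = 0;
  double S = std::sqrt(-INFINITY);
  std::printf("sqrt: %f errno=%d\n", S, errno); // nan, errno = EDOM
  return 0;
}
```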

This patch avoids the erroneous (pun not intended) transformation by
applying the restrictions discussed in the thread for
https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.

The existing tests are updated (depending on emphasis in the checks for
library calls, avoidance of overlap, and overall coverage):
  - to add `ninf`, retaining the intended library call,
  - to use the intrinsic, retaining the use of `select`, or
  - to expect the replacement to not occur.

The following is tested:
  - The pow intrinsic folds to a `select` instruction to
    handle -infinity.
  - The pow library call folds, with `ninf`, to `sqrt` without the
    `select` instruction associated with handling -infinity.
  - The pow library call does not fold to `sqrt` without `ninf`.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D87877
2020-09-22 18:58:05 -04:00
Alexey Bataev d6ac649ccd [SLP]Fix coding style, NFC. 2020-09-22 17:44:29 -04:00
Philip Reames e1a3271ebb [AArch64] Teach analyzeBranch to remove branch equivalent to fallthrough
The motivation here is that MachineBlockPlacement relies on analyzeBranch to remove branches to fallthrough blocks when the branch is not fully analyzable. With the introduction of the FAULTING_OP pseudo for implicit null checking (see D87861), this case becomes important. Note that it's hard to otherwise exercise this path, as BranchFolding handles any fully analyzable branch sequence without using this interface.

p.s. For anyone who saw my comment in the original review, what I thought was an issue in BranchFolding originally turned out to simply be a bug in my patch. (Now fixed.)

Differential Revision: https://reviews.llvm.org/D88035
2020-09-22 14:38:27 -07:00
Fangrui Song 49f2744931 Change LoopInfo::empty to isInnermost after D82895 2020-09-22 14:07:40 -07:00
Stefanos Baziotis a7873e5abc Small fixes for "[LoopInfo] empty() -> isInnermost(), add isOutermost()" 2020-09-22 23:59:34 +03:00
Reid Kleckner 90242caca2 Revert "[CodeGen] emit CG profile for COFF object file"
This reverts commit 91aed9bf97; it is
causing link errors.
2020-09-22 13:47:39 -07:00
Stefanos Baziotis 89c1e35f3c [LoopInfo] empty() -> isInnermost(), add isOutermost()
Differential Revision: https://reviews.llvm.org/D82895
2020-09-22 23:28:51 +03:00
Congzhe Cao 4edb3d3646 [AArch64] Avoid pairing loads with same result reg
When pairing ldr instructions into an ldp instruction, we cannot pair two ldr
destination registers where one is a sub- or super-register of the other.
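
Condensed, the new rejection looks roughly like this (hypothetical form; getLdStRegOp is assumed to be the optimizer's accessor for the destination operand):

```
// Reject the candidate pair when one destination register is a sub- or
// super-register of the other (or the same register), e.g. pairing
// "ldr w1, [x0]" with "ldr x1, [x0, #8]".
Register RegA = getLdStRegOp(FirstMI).getReg();
Register RegB = getLdStRegOp(MI).getReg();
if (TRI->isSuperOrSubRegisterEq(RegA, RegB))
  return false;
```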

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D86906
2020-09-22 16:25:08 -04:00
Mircea Trofin cf112382dd [ThinLTO] Option to bypass function importing.
This completes the circle, complementing -lto-embed-bitcode
(specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips
function importing. The index file is still needed for the other data it
contains.

Differential Revision: https://reviews.llvm.org/D87949
2020-09-22 13:12:11 -07:00
Paul C. Anagnostopoulos 21f5f509c8 Two patches to fix the broken build.
One to fix a C++ compiler warning.
One to allow Sphinx to find a new document.
2020-09-22 16:00:31 -04:00
Roman Lebedev b289dc5306 [CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain its operands
This is practically identical to what we already do for UDiv/URem:
  https://rise4fun.com/Alive/04K

Name: narrow udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow exact udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv exact i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow urem
Pre: C0 u<= 255 && C1 u<= 255
%r = urem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = urem i8 %t0, %t1
%r = zext i8 %t2 to i16

... only here we need to look for 'min signed bits', not 'active bits',
and there's UB to be aware of:
  https://rise4fun.com/Alive/KG86
  https://rise4fun.com/Alive/LwR

Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv exact i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = srem i9 %t0, %t1
%r = sext i9 %t2 to i16


Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv exact i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = srem i8 %t0, %t1
%r = sext i8 %t2 to i16


The ConstantRangeTest.losslessSignedTruncationSignext test sanity-checks
the logic: we can losslessly truncate a ConstantRange to
`getMinSignedBits()` and sign-extend it back, and the result will be
identical to the original CR.

On vanilla llvm test-suite + RawSpeed, this fires 1262 times,
while the same fold for UDiv/URem only fires 384 times. Sic!

Additionally, this causes +606.18% (+1079) extra cases of
aggressive-instcombine.NumDAGsReduced, and +473.14% (+1145)
of aggressive-instcombine.NumInstrsReduced folds.
2020-09-22 21:37:30 +03:00
Roman Lebedev 4977eadee5 [NFC][CVP] Give a better name to the STATISTIC() counting udiv i16 -> udiv i8 xforms 2020-09-22 21:37:30 +03:00
Roman Lebedev 7465da2077 [ConstantRange] Introduce getMinSignedBits() method
Similar to ConstantRange::getActiveBits() and to the similarly-named
methods in APInt, this returns the bit width needed to represent
the given signed constant range.
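
For example (illustrative usage):

```
#include "llvm/ADT/APInt.h"
#include "llvm/IR/ConstantRange.h"
using namespace llvm;

// The half-open range [-128, 128) over i16 holds the signed values
// -128..127, so 8 bits suffice to represent all of them.
ConstantRange CR(APInt(16, -128, /*isSigned=*/true),
                 APInt(16, 128, /*isSigned=*/true));
unsigned Bits = CR.getMinSignedBits(); // == 8
```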
2020-09-22 21:37:30 +03:00
Roman Lebedev ba5afe5588 [NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()
As an exhaustive test shows, this logic is fully identical to the old
implementation, with the exception of the case where both of the operands
had empty ranges:

```
TEST_F(ConstantRangeTest, CVP_UDiv) {
  unsigned Bits = 4;
  EnumerateConstantRanges(Bits, [&](const ConstantRange &CR0) {
    if (CR0.isEmptySet())
      return;
    EnumerateConstantRanges(Bits, [&](const ConstantRange &CR1) {
      if (CR1.isEmptySet())
        return;

      unsigned MaxActiveBits = 0;
      for (const ConstantRange &CR : {CR0, CR1})
        MaxActiveBits = std::max(MaxActiveBits, CR.getActiveBits());

      ConstantRange OperandRange(Bits, /*isFullSet=*/false);
      for (const ConstantRange &CR : {CR0, CR1})
        OperandRange = OperandRange.unionWith(CR);
      unsigned NewWidth = OperandRange.getUnsignedMax().getActiveBits();

      EXPECT_EQ(MaxActiveBits, NewWidth) << CR0 << " " << CR1;
    });
  });
}
```
2020-09-22 21:37:29 +03:00
Roman Lebedev 2ed9c4c70b [ConstantRange] Introduce getActiveBits() method
Much like APInt::getActiveBits(), this computes how many bits are needed
to represent every value in this constant range,
treating the values as unsigned.
2020-09-22 21:37:29 +03:00