llvm-project

Commit Graph

Author	SHA1	Message	Date
David Sherwood	57ca65e21e	[AArch64] Add instruction costs for FP_TO_UINT and FP_TO_SINT with half types We were missing some instruction costs when converting vectors of floating point half types into integers, so I've added those here. I also manually generated assembly code for each FP->int case and looked at the number of instructions generated, which meant adjusting some of the existing costs too. I've updated an existing test to reflect the new costs: Analysis/CostModel/AArch64/sve-fptoi.ll Differential Revision: https://reviews.llvm.org/D99935	2021-04-21 09:39:45 +01:00
Yang Fan	4307446e9f	[SCEV] Fix -Wunused-variable warning (NFC) GCC warning: ``` /llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp: In member function ‘const llvm::SCEV* llvm::ScalarEvolution::getLosslessPtrToIntExpr(const llvm::SCEV, unsigned int)::SCEVPtrToIntSinkingRewriter::visitUnknown(const llvm::SCEVUnknown)’: /llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:1152:13: warning: unused variable ‘ExprPtrTy’ [-Wunused-variable] 1152 \| Type *ExprPtrTy = Expr->getType(); \| ^~~~~~~~~ ```	2021-04-21 16:01:46 +08:00
Nikita Popov	de18fa9e52	Revert "[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543)" This reverts commit `ea1a0d7c9a`. While this is strictly more powerful, it is also strictly slower. InstSimplify intentionally does not perform many folds that it is allowed to perform, if doing so requires a KnownBits calculation that will be repeated in InstCombine. Maybe it's worthwhile to do this here, but that needs a more explicitly stated motivation, evaluated in a review.	2021-04-21 09:55:25 +02:00
Zakk Chen	ad0fe5db2f	[RISCV][MC] Mask load should not have VMConstraint. Add a test, dest register could be v0. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D100825	2021-04-21 15:21:37 +08:00
Serge Pavlov	d20a2376d8	[RISCV] Introduce floating point control and state registers New registers FRM, FFLAGS and FCSR were defined. They represent corresponding system registers. The new registers are necessary to properly order floating point instructions in non-default modes. Differential Revision: https://reviews.llvm.org/D99083	2021-04-21 12:55:30 +07:00
Zi Xuan Wu	ca31b43ae8	[NFC][CSKY] Resort the instruction description in td Resort the instruction description in td to make it easy to upstream more instructions and add predicts later.	2021-04-21 12:36:07 +08:00
George Balatsouras	79b5280a6c	[dfsan] Enable origin tracking with fast8 mode All related instrumentation tests have been updated. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D100903	2021-04-20 18:10:32 -07:00
Adrian Prantl	81cad0be68	Make sure PHIElimination doesn't copy debug locations across basic blocks. PHIElimination may insert copy instructions in multiple basic blocks. Moving debug locations across basic block boundaries would be misleading as illustrated by the test case. rdar://75463656 Differential Revision: https://reviews.llvm.org/D100886	2021-04-20 17:03:29 -07:00
Sam Clegg	103956170b	[WebAssembly] Update README. NFC. This is just a cleanup of the very high level stuff. I'm sure there is more to update here but I'll leave that to others and/or a followup. Differential Revision: https://reviews.llvm.org/D100888	2021-04-20 16:59:08 -07:00
Arthur Eubanks	326da4adcb	[FuncAttrs] Always preserve FunctionAnalysisManagerCGSCCProxy FunctionAnalysisManagerCGSCCProxy should not be preserved if any of its keys may be invalid. Since we are not removing/adding functions in FuncAttrs, it's fine to preserve it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D100893	2021-04-20 16:37:45 -07:00
Reid Kleckner	91f7a4fff7	Revert "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769)" This reverts commit `13ec913bdf`. This commit introduces new uses of the overflow checking intrinsics that depend on implementations in compiler-rt, which Windows users generally do not link against. I filed an issue (somewhere) to make clang auto-link the builtins library to resolve this situation, but until that happens, it isn't reasonable for the optimizer to introduce new link time dependencies.	2021-04-20 15:53:34 -07:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Roman Lebedev	5a654bfeab	Revert "[InstCombine] `sext(trunc(x)) --> sext(x)` iff trunc is NSW (PR49543)" I forgot about the case where we sign-extend to width smaller than the original. This reverts commit `1e6ca23ab8`.	2021-04-21 01:11:15 +03:00
Roman Lebedev	1e68d338c1	Revert "[InstCombine] "Bypass" NUW trunc of lshr if we are going to sext the result (PR49543)" I forgot about the case where we sign-extend to width smaller than the original. This reverts commit `41b71f718b`.	2021-04-21 01:11:14 +03:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst a IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Roman Lebedev	41b71f718b	[InstCombine] "Bypass" NUW trunc of lshr if we are going to sext the result (PR49543) This is a more convoluted form of the same pattern "sext of NSW trunc", but in this case the operand of trunc was a right-shift, and the truncation chops off just the zero bits that were shifted-in.	2021-04-21 00:31:46 +03:00
Roman Lebedev	ea1a0d7c9a	[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543) We already special-cased a few interesting patterns, but that is strictly less powerful than using KnownBits. So instead get the known bits for the operand of `and`, and iff all the unset bits of the `and`-mask are known to be zeros in the operand, we can omit said `and`.	2021-04-21 00:31:46 +03:00
Roman Lebedev	1e6ca23ab8	[InstCombine] `sext(trunc(x)) --> sext(x)` iff trunc is NSW (PR49543) If we can tell that trunc only chops off sign bits, and not all of them, then we can simply sign-extend the trunc's source.	2021-04-21 00:31:45 +03:00
Sanjay Patel	1e202e8f39	[InstCombine] fold shift-of-srem-by-2 to mask+shift There are several potential srem-by-2 folds because the result is known {-1,0,1}. https://alive2.llvm.org/ce/z/LuVyeK	2021-04-20 17:10:16 -04:00
Sam Clegg	d2de2d1724	[WebAssembly] Remove unused known_gcc_test_failures.txt. NFC Differential Revision: https://reviews.llvm.org/D100887	2021-04-20 14:07:25 -07:00
Alexey Bataev	673e2f1b70	[COST][AARCH64] Improve cost of reverse shuffles for AArch64. Introduced the cost of the reverse shuffles for AArch64, currently just copied the costs for PermuteSingleSrc. Differential Revision: https://reviews.llvm.org/D100871	2021-04-20 13:47:56 -07:00
Philip Reames	6792e26c0d	Reapply "Look through invertible recurrences in isKnownNonEqual" I'd reverted this in commit `3b6acb1797` due to buildbot failures. This patch contains the fix for said issue. I'd forgotten to handle the case where two phis in the same block have different operand order. We canonicalize away from this, but it's still valid IR. The tests included in this change (as opposed to simply having test output changed), crashed without the fix. Original commit message follows... This extends the phi handling in isKnownNonEqual with a special case based on invertible recurrences. If we can prove the recurrence is invertible (which many common ones are), we can recurse through the start operands of the recurrence skipping the phi cycle. (Side note: Instcombine currently does not push back through these cases. I will implement that in a follow up change w/separate review.) Differential Revision: https://reviews.llvm.org/D99912	2021-04-20 12:47:59 -07:00
Jon Roelofs	167da6c9e8	[AArch64][GlobalISel] Clarify fallback debug print ... to only print when that fallback actually happens.	2021-04-20 12:41:14 -07:00
Thomas Lively	693d767c60	[WebAssembly] More codegen for f64x2.convert_low_i32x4_{s,u} `af7925b4dd` added a custom DAG combine for recognizing fp-to-ints of extract_subvectors that could be lowered to f64x2.convert_low_i32x4_{s,u} instructions. This commit extends the combines to recognize equivalent extract_subvectors of fp-to-ints as well. Differential Revision: https://reviews.llvm.org/D100790	2021-04-20 12:37:13 -07:00
Philip Reames	3b6acb1797	Revert "Look through invertible recurrences in isKnownNonEqual" This reverts commit `be20eae25f`. It appears to have caused a crash on a buildbot (https://lab.llvm.org/buildbot#builders/77/builds/5653). Reverting while investigating.	2021-04-20 11:47:10 -07:00
Philip Reames	9c1a145aeb	Rearrange code to reduce diff for D99687 [nfc] Adding the switches to reduce diffs. I'm about to split that into an lshr part and an ashr part, doing the NFC part first makes it easier to maintain both diffs.	2021-04-20 11:40:15 -07:00
Roman Lebedev	13ec913bdf	[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769) We already had support for its unsigned variant, so simply extend it to also handle the signed variant. Fixes https://bugs.llvm.org/show_bug.cgi?id=48769	2021-04-20 21:29:43 +03:00
Roman Lebedev	7186764884	[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr()	2021-04-20 21:29:21 +03:00
Philip Reames	be20eae25f	Look through invertible recurrences in isKnownNonEqual This extends the phi handling in isKnownNonEqual with a special case based on invertible recurrences. If we can prove the recurrence is invertible (which many common ones are), we can recurse through the start operands of the recurrence skipping the phi cycle. (Side note: Instcombine currently does not push back through these cases. I will implement that in a follow up change w/separate review.) Differential Revision: https://reviews.llvm.org/D99912	2021-04-20 10:52:22 -07:00
Joseph Huber	b2ad63d3cf	[OpenMP] Add OpenMPOpt as a Module pass Summary: This patch registers OpenMPOpt as a Module pass in addition to a CGSCC pass. This is so certain optimizations that are sensitive to intact call-sites can happen before inlining. The old `openmpopt` pass name is changed to `openmp-opt-cgscc` and `openmp-opt` calls the Module pass. The current module pass only runs a single check but will be expanded in the future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D99202	2021-04-20 12:28:58 -04:00
Simon Pilgrim	bc98076ff6	Silence MSVC signed/unsigned comparison warning. NFCI.	2021-04-20 17:20:13 +01:00
Simon Pilgrim	2a419a0b99	[X86][SSE] combineX86ShuffleChain - check if we're blending with zero into already zero elements Add a SelectionDAG::MaskedElementsAreZero helper that wraps SelectionDAG::MaskedValueIsZero testing for entirely zero vector elements	2021-04-20 17:09:49 +01:00
Alexey Bataev	af870e11ae	[SLP] Add detection of shuffled/perfect matching of tree entries. SLP supports perfect diamond matching for the vectorized tree entries but do not support it for gathered entries and does not support non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds support for this matching to improve cost of the vectorized tree. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D100495	2021-04-20 09:08:46 -07:00
Philip Reames	3b1474cab2	free(nullptr) does not violate the nofree specification This fixes a subtle and nasty bug in my `86664638`. The problem is that free(nullptr) is well defined (and common). The specification for the nofree attributes talks about memory objects, and doesn't explicitly address null, but I think it's reasonable to assume that nofree doesn't disallow a call to free(nullptr). If it did, we'd have to prove nonnull on an argument to ever infer nofree which doesn't seem to be the intent. This was found by Nuno and Alive2 over in https://reviews.llvm.org/D100141#2697374. Differential Revision: https://reviews.llvm.org/D100779	2021-04-20 09:08:05 -07:00
Matt Arsenault	620fdb9671	GlobalISel: Defer register creation in handleAssignments This is currently built on top of the SelectionDAG call lowering, but does not use it the same way. SelectionDAG passes legalized types to the assignment functions, and the tablegenerated assignment functions may change the value types expected for registers. This does not change the types used, just moves the register creation to help fix this in the future. Defer the register creation until after all of the assignment decisions have been made. This will also help have correct tail call compatibility checking in a future change. Currently it does not work as expected for any arguments split across multiple registers.	2021-04-20 11:48:12 -04:00
Jay Foad	ec8c61efdf	[AMDGPU] Allow multiple uses of the same literal In GFX10 VOP3 can have a literal, which opens up the possibility of two operands using the same literal value, which is allowed and only counts as one use of the constant bus. AMDGPUAsmParser::validateConstantBusLimitations already knew about this but SIInstrInfo::verifyInstruction did not. Differential Revision: https://reviews.llvm.org/D100770	2021-04-20 16:44:01 +01:00
Ahmed Bougacha	a0573b6c10	[AArch64] Bump apple-latest CPU alias to apple-a14.	2021-04-20 08:41:04 -07:00
Ahmed Bougacha	a8a3a43792	[AArch64] Add apple-m1 CPU, and default to it for macOS. apple-m1 has the same level of ISA support as apple-a14, so this is a straightforward mechanical change. However, that also means this inherits apple-a14's v8.5a+nobti quirkiness. rdar://68287159	2021-04-20 08:41:04 -07:00
Matt Arsenault	14b03b4aad	GlobalISel: Check for powers of 2 for inverse funnel shift lowering This doesn't make a practical difference since it would only be broken if a target actually had a legal non-power-of-2 inverse shift.	2021-04-20 11:30:22 -04:00
Alexey Bataev	b82344a019	Revert "[SLP] Add detection of shuffled/perfect matching of tree entries." This reverts commit `daf6e18c55` to fix the compiler crash.	2021-04-20 08:29:32 -07:00
David Green	21a8b9d9e9	[ARM] Limit PerformExtractEltToVMOVRRD to when f64 is legal. The generic SoftFloatVectorExtract.ll test was failing when run on arm machines, as it tries to create a f64 under soft float. Limit the transform to when f64 is legal. Also add a missing override, as reported in D100244.	2021-04-20 16:24:36 +01:00
Matt Arsenault	1cb8a9d595	AMDGPU/GlobalISel: Fix uitofp/sitofp with non-power-of-2 integers	2021-04-20 11:13:29 -04:00
Matt Arsenault	83a25a1010	GlobalISel: Restrict narrow scalar for fptoui/fptosi results This practically only works for the f16 case AMDGPU uses, not wider types. Fixes bug 49710 by failing legalization.	2021-04-20 10:54:40 -04:00
Matt Arsenault	8fbe04f46b	MachineVerifier: Continue reporting errors for copies This was skipping verification of later copies, but generally the verifier tries to report as many things wrong as possible in the function.	2021-04-20 10:54:40 -04:00
Alexey Bataev	daf6e18c55	[SLP] Add detection of shuffled/perfect matching of tree entries. SLP supports perfect diamond matching for the vectorized tree entries but do not support it for gathered entries and does not support non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds support for this matching to improve cost of the vectorized tree. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D100495	2021-04-20 07:46:49 -07:00
Bradley Smith	b8b075d8d7	[AArch64][SVE] Lower MULHU/MULHS nodes to umulh/smulh instructions Mark MULHS/MULHU nodes as legal for both scalable and fixed SVE types, and lower them to the appropriate SVE instructions. Additionally now that the MULH nodes are legal, integer divides can be expanded into a more performant code sequence. Differential Revision: https://reviews.llvm.org/D100487	2021-04-20 15:18:06 +01:00
Alexey Bataev	cf00cb8bed	Revert "[SLP] Add detection of shuffled/perfect matching of tree entries." This reverts commit `b232771aca` to fix buildbots.	2021-04-20 07:16:11 -07:00
David Green	48cef1fa8e	[ARM] Create VMOVRRD from adjacent vector extracts This adds a combine for extract(x, n); extract(x, n+1) -> VMOVRRD(extract x, n/2). This allows two vector lanes to be moved at the same time in a single instruction, and thanks to the other VMOVRRD folds we have added recently can help reduce the amount of executed instructions. Floating point types are very similar, but will include a bitcast to an integer type. This also adds a shouldRewriteCopySrc, to prevent copy propagation from DPR to SPR, which can break as not all DPR regs can be extracted from directly. Otherwise the machine verifier is unhappy. Differential Revision: https://reviews.llvm.org/D100244	2021-04-20 15:15:43 +01:00
Alexey Bataev	b232771aca	[SLP] Add detection of shuffled/perfect matching of tree entries. SLP supports perfect diamond matching for the vectorized tree entries but do not support it for gathered entries and does not support non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds support for this matching to improve cost of the vectorized tree. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D100495	2021-04-20 06:55:55 -07:00
Cullen Rhodes	f166d0db71	[AArch64][AsmParser] NFC: Remove unused ExtendOp struct Left over from `2625a993f9` when extend and shift were merged.	2021-04-20 13:45:09 +00:00

146169 Commits