llvm-project

Commit Graph

Author	SHA1	Message	Date
Philip Reames	c5e491e6ee	[SCEV] Modernize code style of isSCEVExprNeverPoison [NFC] Use for-range and all_of to make code easier to read in advance of other changes.	2021-09-30 15:13:43 -07:00
Amara Emerson	ca8316b704	[GlobalISel] Extend CombinerHelper::matchConstantOp() to match constant splat vectors. This allows the "x op 0 -> x" fold to optimize vector constant RHSs. Differential Revision: https://reviews.llvm.org/D110802	2021-09-30 14:31:25 -07:00
Craig Topper	a21c557955	[RISCV] Remove Zbproposedc extension This consists of 3 compressed instructions, c.not, c.neg, and c.zext.w. I believe these have been picked up by the Zce effort using different encodings. I don't think it makes sense to keep them in bitmanip. It will eventually cause a conflict if/when Zce is implemented in llvm. Differential Revision: https://reviews.llvm.org/D110871	2021-09-30 14:23:05 -07:00
Jon Chesterfield	3247329107	[openmp] Add addrspacecast to getOrCreateIdent Fixes 51982. Adds a missing CreatePointerCast and allocates a global in the correct address space. Test case derived from https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting parts while checking the assertion failure still occurred. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110556	2021-09-30 21:36:31 +01:00
Arnold Schwaighofer	2df2b27d94	[coro async] Cleanup undefined llvm.coro.async.resume In situations where the coroutine function is not split we can just replace the async.resume by null. rdar://82591919 Differential Revision: https://reviews.llvm.org/D110191	2021-09-30 13:26:53 -07:00
Florian Hahn	1fbdbb5595	Revert "Recommit "[SCEV] Look through single value PHIs." (take 2)" This reverts commit `764d9aa979`. This patch exposed a few additional cases where SCEV expressions are not properly invalidated. See PR52024, PR52023.	2021-09-30 20:53:51 +01:00
Maksim Panchenko	050edef853	[MC] Make MCDwarfLineStr class public Add MCDwarfLineStr class to the public API. Note that MCDwarfLineTableHeader::Emit(), takes MCDwarfLineStr as an Optional<> parameter making it impossible to use the API if the class is not publicly defined. Reviewed By: alexander-shaposhnikov Differential Revision: https://reviews.llvm.org/D109412	2021-09-30 12:31:59 -07:00
Albion Fung	4195ed9959	[PowerPC] Improved codegen related to xscvdpsxws/xscvdpuxws This patch removes the unnecessary mf/mtvsr generated in conjunction with xscvdpsxws/xscvdpuxws. Differential revision: https://reviews.llvm.org/D109902	2021-09-30 14:31:00 -05:00
Amara Emerson	80f4bb5c61	[GlobalISel] Extend G_SELECT of known condition combine to vectors. Adds a new utility function: isConstantOrConstantSplatVector(). Differential Revision: https://reviews.llvm.org/D110786	2021-09-30 12:16:44 -07:00
Sanjay Patel	3fcb00df5d	[InstCombine] restrict shift-trunc-shift fold to opposite direction shifts This is NFCI because the pattern with 2 left-shifts should get folded independently by smaller folds. The motivation is to refine this block to avoid infinite loops seen with D110170.	2021-09-30 15:06:13 -04:00
Nikita Popov	b989211d7d	[BasicAA] Move more extension logic into ExtendedValue (NFC) Add methods to appropriately extend KnownBits/ConstantRange there, same as with APInt. Also clean up the known bits handling by actually doing that extension rather than checking ZExtBits. This doesn't matter now, but becomes relevant once truncation is involved.	2021-09-30 20:45:12 +02:00
Stanislav Mekhanoshin	244aa7f735	[AMDGPU] move hasAGPRs/hasVGPRs into header It is now very simple and can go right into the header allowing optimizer to combine callers, such as isVGPRClass and similar. It does not need anything from the TRI itself anymore, so make it static class member along with the callers. Differential Revision: https://reviews.llvm.org/D110762	2021-09-30 10:02:02 -07:00
Nikita Popov	ea02f9caff	[BasicAA] Use ExtendedValue in VariableGEPIndex (NFC) Use the ExtendedValue structure which is used for LinearExpression in VariableGEPIndex as well.	2021-09-30 18:48:51 +02:00
Adrian Prantl	9232ca4712	Improve the effectiveness of BDCE's debug info salvaging This patch improves the effectiveness of BDCE's debug info salvaging by processing the instructions in reverse order and delaying dropAllReferences until after debug info salvaging. This allows salvaging of entire chains of deleted instructions! Previously we would remove all references from an instruction, which would make it impossible to use that instruction to salvage a later instruction in the instruction stream, because its operands were already removed. This reapplies the previous patch with a fix for a use-after-free. Differential Revision: https://reviews.llvm.org/D110568	2021-09-30 09:28:49 -07:00
Kazu Hirata	f631173d80	[llvm] Migrate from arg_operands to args (NFC) Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-09-30 08:51:21 -07:00
Anna Thomas	6f2d01376d	[LoopPredication] Remove unused variable After rG452714f8f8037ff37f9358317651d1652e231db2, the Function `F` retrieved in LoopPredication is not used. Remove this unused variable to stop some buildbots (ASAN, clang-ppc) from failing.	2021-09-30 10:40:47 -04:00
Anna Thomas	452714f8f8	[BPI] Keep BPI available in loop passes through LoopStandardAnalysisResults This is analogous to D86156 (which preserves "lossy" BFI in loop passes). Lossy means that the analysis preserved may not be up to date with regards to new blocks that are added in loop passes, but BPI will not contain stale pointers to basic blocks that are deleted by the loop passes. This is achieved through BasicBlockCallbackVH in BPI, which calls eraseBlock that updates the data structures in BPI whenever a basic block is deleted. This patch does not have any changes in the upstream pipeline, since none of the loop passes in the pipeline use BPI currently. However, since BPI wasn't previously preserved in loop passes, the loop predication pass was invoking BPI on the entire function every time it ran in an LPM. This caused massive compile time in our downstream LPM invocation which contained loop predication. See updated test with an invocation of a loop-pipeline containing loop predication and -debug-pass turned ON. Reviewed-By: asbirlea, modimo Differential Revision: https://reviews.llvm.org/D110438	2021-09-30 10:27:05 -04:00
David Green	f9aa8623fe	[ARM] Add more MVE intrinsics to sink splats to This adds a few more unpredicated intrinsics to sink splats to, in order to create more qr instruction variants. Notably this includes saddsat/uaddsat but also some of the unpredicated mve intrinsics. Differential Revision: https://reviews.llvm.org/D110333	2021-09-30 14:41:23 +01:00
Brock Wyma	bafd8b1add	[CodeView] Recognize Fortran95 as Fortran instead of MASM Map Fortran95 sources to Fortran so the CodeView language is not emitted as MASM. Differential Revision: https://reviews.llvm.org/D110330	2021-09-30 09:27:05 -04:00
Jingu Kang	13f3c39f36	Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit `c07f709969` with bug fixes. Differential Revision: https://reviews.llvm.org/D109963	2021-09-30 09:27:08 +01:00
Jay Foad	156d7d2df7	[LiveIntervals] Remove unused subreg ranges in repairIntervalsInRange If the old instructions mentioned a subreg that the new instructions do not, remove the subrange for that subreg. For example, in TwoAddressInstructionPass::eliminateRegSequence, if a use operand in the REG_SEQUENCE has the undef flag then we don't generate a copy for it so after the elimination there should be no live interval at all for the corresponding subreg of the def. This is a small step towards switching TwoAddressInstructionPass over from LiveVariables to LiveIntervals. Currently this path is only tested if you explicitly enable -early-live-intervals. Differential Revision: https://reviews.llvm.org/D110542	2021-09-30 09:15:10 +01:00
Clement Courbet	455b60ccfb	[AA] Teach BasicAA to recognize basic GEP range information. The information can be implicit (from `ValueTracking`) or explicit. This implements the backend part of the following RFC https://groups.google.com/g/llvm-dev/c/T9o51zB1JY. We still need to settle on how to best represent the information in the IR, but this is a separate discussion. Differential Revision: https://reviews.llvm.org/D109746	2021-09-30 08:29:32 +02:00
Ruiling Song	52785989e9	AMDGPU: Broadcast scalar boolean to vector boolean explicitly This is used to fix wrong code generation of s_add_co_select_user in test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll s_addc_u32 s4, s6, 0 s_cselect_b64 vcc, 1, 0 <-- vcc set as 0x1 if SCC==1 v_mov_b32_e32 v1, s4 s_cmp_gt_u32 s6, 31 v_cndmask_b32_e32 v1, 0, v1, vcc If the s_addc_u32 set SCC, then we will get value 0x1 in VCC. The v_cndmask will do per thread selection with VCC as condition register. As VCC only gets the first bit being set, only the first thread/lane in destination register can get correct result if the very first lane is active. In fact, we should broadcast the value to all active lanes of the final register. The idea here is doing this broadcast to vector boolean explicitly instead of lowering it into a COPY from SCC which would be interpreted as selecting between 0/1. This is used to replace D109754. Reviewed-by: foad, alex-t Differential Revision: https://reviews.llvm.org/D109889	2021-09-30 10:15:01 +08:00
Fangrui Song	8971b99c83	[llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling convention or the vector calling convention. See https://github.com/riscv/riscv-elf-psabi-doc/pull/190 Differential Revision: https://reviews.llvm.org/D107949	2021-09-29 16:56:52 -07:00
Amara Emerson	1c0e8a98e4	[AArch64][GlobalISel] Widen G_BUILD_VECTOR source & dest element types to s8.	2021-09-29 15:11:30 -07:00
Nikita Popov	2898101552	[BasicAA] Move DecomposedGEP out of header (NFC) It's sufficient to have a forward declaration in the header, we can move the definition of the struct (and VariableGEPIndex) in the source file.	2021-09-29 23:45:15 +02:00
Nikita Popov	45288edb65	[BasicAA] Pass whole DecomposedGEP to subtraction API (NFC) Rather than separately handling subtraction of offset and variable indices, make this one operation. Also rewrite the implementation to use range-based for loops.	2021-09-29 23:32:15 +02:00
Sam McCall	22555bafe9	[VFS] InMemoryFilesystem's UniqueIDs are a function of path and content. This ensures that re-creating "the same" FS results in the same UIDs for files. In turn, this means that creating a clang module (preamble) using one in-memory filesystem and consuming it using another doesn't create duplicate FileEntrys for files that are the same in both FSes. It's tempting to give the creator control over the UIDs instead. However that requires fiddly API changes, e.g. what should the UIDs of intermediate directories be? This change is more "magic" but seems safe given: - InMemoryFilesystem is used in testing more than production - comparing UIDs across filesystems is unusual - files with the same path and content are usually logically equivalent (The usual reason for re-creating virtual filesystems rather than reusing them is that typical use involves mutating their CWD and so is not threadsafe). Differential Revision: https://reviews.llvm.org/D110711	2021-09-29 23:24:18 +02:00
Ricky Taylor	e1e3b6ee72	[M68k] Avoid UB in disassembler When reading 32 bits a 32-bit shift would be executed. This is undefined behaviour, but in this case we can just replace the entire scratch value to avoid it. Differential Revision: https://reviews.llvm.org/D110769	2021-09-29 22:07:14 +01:00
Nikita Popov	49813f7fbf	[BasicAA] Pass DecomposedGEP to constantOffsetHeuristic() (NFC) Rather than separately passing VarIndices and BaseOffset, pass the whole DecomposedGEP.	2021-09-29 22:23:27 +02:00
Joseph Huber	c11ebfea6d	[OpenMP][NFC] Fix linting messages in OpenMPOpt Summary: This patch addresses some linting messages I keep getting in my editor when working on OpenMPOpt.	2021-09-29 16:07:33 -04:00
Joseph Huber	87ce7e65f2	[OpenMP] Add missing distribute definitions to AAKernelInfo Summary: The RTL functions added in https://reviews.llvm.org/D110429 were mistakenly left out from the list of safe runtime calls in AAKernelInfo. This patch adds them in.	2021-09-29 16:06:34 -04:00
Wael Yehia	8b8da01d88	Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace." This reverts commit `a60405cf03`.	2021-09-29 19:43:35 +00:00
Stefan Pintilie	fb4e44c4e7	[PowerPC] The builtins load8r and store8r are Power 7 plus. This patch makes sure that the builtins __builtin_ppc_load8r and __builtin_ppc_store8r are only available for Power 7 and up. Currently the builtins seem to produce incorrect code if used for Power 6 or before. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D110653	2021-09-29 14:34:40 -05:00
Sjoerd Meijer	367df18050	[LoopFlatten] Bail if we can't perform flattening after IV widening It can happen that after widening of the IV, flattening may not be possible, e.g. when it is deemed unprofitable. We were not properly checking this, which resulted in flattening being applied when it shouldn't, also leading to incorrect results (miscompilation). This should fix PR51980 (https://bugs.llvm.org/show_bug.cgi?id=51980) Differential Revision: https://reviews.llvm.org/D110712	2021-09-29 19:53:34 +01:00
Roman Lebedev	2d42a192e0	[X86][Costmodel] Load/store i8 Stride=2 VF=32 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/xz6x7c35P - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=2.5` So pick cost of `6`. For store we have: https://godbolt.org/z/xz6x7c35P - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0` So pick cost of `4`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110709	2021-09-29 21:52:45 +03:00
Roman Lebedev	bac60c55e0	[X86][Costmodel] Load/store i8 Stride=2 VF=16 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/a9hv4z47v - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: =2.0` So pick cost of `4`. For store we have: https://godbolt.org/z/6GfPn1b79 - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=2.0` So pick cost of `3`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110708	2021-09-29 21:52:45 +03:00
Roman Lebedev	1962185671	[X86][Costmodel] Load/store i8 Stride=2 VF=8 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 Identical to VF=2. For load we have: https://godbolt.org/z/4TEbdzbMM - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0` So pick cost of `2`. For store we have: https://godbolt.org/z/MYfzGPf3Y - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5` So pick cost of `1`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110705	2021-09-29 21:52:45 +03:00
Roman Lebedev	08face1f9a	[X86][Costmodel] Load/store i8 Stride=2 VF=4 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 Identical to VF=2. For load we have: https://godbolt.org/z/sGE41GYo7 - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0` So pick cost of `2`. For store we have: https://godbolt.org/z/ba5r3s9xa - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5` So pick cost of `1`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110704	2021-09-29 21:52:45 +03:00
Roman Lebedev	7d52628eb0	[X86][Costmodel] Load/store i8 Stride=2 VF=2 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/caKqjr9hb - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0` So pick cost of `2`. For store we have: https://godbolt.org/z/6TTn3eKj8 - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5` So pick cost of `1`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110702	2021-09-29 21:52:44 +03:00
Wesley Wiser	2dd883439c	[Mangler] Calculate the argument list byte count suffix correctly when returning large values `__stdcall`, `__fastcall` and `__vectorcall` return large values via a hidden pointer argument. However, the size of that argument should not be included in the argument list byte count suffix added to the function's decorated name. This patch fixes that issue so that LLVM generates the same decorated name as MSVC does. MSVC example: https://godbolt.org/z/nc35MKPhr Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D110719	2021-09-29 11:42:28 -07:00
Jay Foad	f9b68304a2	[AMDGPU] Enable machine verification after AMDGPUISelDAGToDAG This was introduced in D32628 but it does not seem to be required any more. At least it does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110692	2021-09-29 18:47:19 +01:00
Sanjay Patel	4414e2ad97	[InstSimplify] (-1 << x) s>> x --> -1 This was noticed in: https://llvm.org/PR51351 https://alive2.llvm.org/ce/z/aLxunD	2021-09-29 13:03:12 -04:00
Kazu Hirata	9a640a1cb8	[AArch64] Remove redundant declaration createAArch64ObjectTargetStreamer (NFC) Note that createAArch64ObjectTargetStreamer is declared in AArch64TargetStreamer.h and defined in AArch64TargetStreamer.cpp. Identified with readability-redundant-declaration.	2021-09-29 09:08:41 -07:00
David Green	e9adcbde31	[AArch64] Model Cortex-A55 Q register NEON instructions Cortex-A55 has 2 64bit NEON vector units, meaning a 128bit instruction requires taking both units (and can only be issued as the first instruction in a dual issue pair). This patch models that by splitting the WriteV SchedWrite into two - the WriteVd that reads/writes only 64bit operands, and the WriteVq that reads/writes 128bit registers. The A55 schedule then uses this distinction to model the WriteVq as taking both resource units, and starting a Schedule Group and WriteVd as taking one as before. I believe this is more correct, even if it does not lead to much better performance. Differential Revision: https://reviews.llvm.org/D108766	2021-09-29 16:55:31 +01:00
Sanjay Patel	ea56dcb730	[InstCombine] fix miscompile from dropRedundantMaskingOfLeftShiftInput() The test is from https://llvm.org/PR51351. There are 2 related logic bugs from over-generalizing "lshr" to "any shr", but I'm not sure how to expose the difference for "MaskC" because instsimplify already folds ashr of -1. I'll extend instsimplify to catch the MaskD pattern as a follow-up, but this patch should be enough to avoid the miscompile.	2021-09-29 11:43:18 -04:00
Jay Foad	9886f21bc1	[MSP430] Recognize Bi as an indirect branch in analyzeBranch. NFC. Recognize Bi as an unconditional branch, just like JMP. This allows machine verification to run after MSP430BranchSelector without failing this assertion: virtual bool llvm::MSP430InstrInfo::analyzeBranch(llvm::MachineBasicBlock &, llvm::MachineBasicBlock &, llvm::MachineBasicBlock &, SmallVectorImpl<llvm::MachineOperand> &, bool) const: Assertion `I->getOpcode() == MSP430::JCC && "Invalid conditional branch"' failed. Note that machine verification is currently disabled after addPreEmitPass passes because of problems on other targets, so this is currently NFC. Differential Revision: https://reviews.llvm.org/D110691	2021-09-29 16:43:11 +01:00
Simon Pilgrim	676f2809b5	[CostModel][AArch64] Don't dereference CostTblEntry before null check. Fix static analysis warning that we check for null Entry after dereferencing it. I don't think this can actually happen as i8/i16 should legalize to use the i32 path which should return a cost - but I'd rather play it safe than rely on an implicit type legalization.	2021-09-29 16:35:29 +01:00
Sam Clegg	210cbcf476	[WebAssembly][Object] Fix dead code in WasmObjectFile.cpp I introduced this by mistake in https://reviews.llvm.org/D109595. Differential Revision: https://reviews.llvm.org/D110717	2021-09-29 08:09:57 -07:00
David Green	8a645fc44b	[AArch64] Enable type promotion for AArch64 This enables the type promotion pass for AArch64, which acts as a CodeGenPrepare pass to promote illegal integers to legal ones, especially useful for removing extends that would otherwise require cross-basic-block analysis. I have enabled this generally, for both ISel and GlobalISel. In some quick experiments it appeared to help GlobalISel remove extra extends in places too, but that might just be missing optimizations that are better left for later. We can disable it again if required. In my experiments, this can improve performance in some cases, and codesize was a small improvement. SPEC was a very small improvement, within the noise. Some of the test cases show extends being moved out of loops, often when the extend would be part of a cmp operand, but that should reduce the latency of the instruction in the loop on many cpus. The signed-truncation-check tests are increasing as they are no longer matching specific DAG combines. We also hope to add some additional improvements to the pass in the near future, to capture more cases of promoting extends through phis that have come up in a few places lately. Differential Revision: https://reviews.llvm.org/D110239	2021-09-29 15:13:12 +01:00
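
The InstSimplify entry in the table above (`4414e2ad97`) states the fold `(-1 << x) s>> x --> -1`. The sketch below is not LLVM code. It is a small Python model, with hypothetical helper names (`to_signed`, `shl`, `ashr`, `lshr`, `fold_holds`), that checks the identity exhaustively for one bit width. It assumes that shift amounts at or above the bit width are excluded, since those do not produce a defined value in LLVM IR.

```python
def to_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        return value - (1 << bits)
    return value


def shl(value, amount, bits):
    """Left shift, truncated to `bits` and reinterpreted as signed."""
    return to_signed(value << amount, bits)


def ashr(value, amount, bits):
    """Arithmetic shift right; Python's >> on negative ints copies the sign."""
    return to_signed(value, bits) >> amount


def lshr(value, amount, bits):
    """Logical shift right: zero bits are shifted in from the top."""
    return to_signed((value & ((1 << bits) - 1)) >> amount, bits)


def fold_holds(bits):
    """Check (-1 << x) s>> x == -1 for every in-range shift amount x."""
    return all(ashr(shl(-1, x, bits), x, bits) == -1 for x in range(bits))
```

In this model the identity holds because `-1 << x` always keeps its sign bit set, so the arithmetic shift refills the vacated low positions with ones. The same check with `lshr` fails for any non-zero `x`, which is in line with the related InstCombine entry (`ea56dcb730`) describing bugs from over-generalizing "lshr" to "any shr".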

151193 Commits
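
The Mangler entry (`2dd883439c`) says that the hidden pointer used to return large values must not count toward the argument list byte count suffix. As a rough illustration only, the Python sketch below computes decorated names for 32-bit x86 under a few assumptions: the function `decorate`, its parameters, and the `(size, is_hidden_return_pointer)` argument encoding are hypothetical, and each argument is assumed to be rounded up to a 4-byte slot. It is not the LLVM implementation.

```python
def decorate(name, convention, args, pointer_size=4):
    """Return a decorated name with an argument byte count suffix.

    `args` is a list of (size_in_bytes, is_hidden_return_pointer) pairs.
    Hypothetical model: hidden return pointers are skipped, and every
    other argument is rounded up to a multiple of `pointer_size`.
    """
    total = 0
    for size, is_hidden_return_pointer in args:
        if is_hidden_return_pointer:
            # The hidden pointer for a large return value is not counted.
            continue
        # Round the argument size up to a whole number of pointer slots.
        total += -(-size // pointer_size) * pointer_size

    if convention == "stdcall":
        return f"_{name}@{total}"
    if convention == "fastcall":
        return f"@{name}@{total}"
    if convention == "vectorcall":
        return f"{name}@@{total}"
    raise ValueError(f"unsupported convention: {convention}")
```

For a function `f` that takes two `int` arguments and returns a large struct through a hidden pointer, `decorate("f", "stdcall", [(4, True), (4, False), (4, False)])` gives `_f@8` in this model. Counting the hidden pointer would give `_f@12`.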