llvm-project

Commit Graph

Author	SHA1	Message	Date
Jorge Gorbe Moya	fc7573f29c	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit `46774df307`.	2022-03-31 14:54:41 -07:00
Matt Arsenault	0fb6856aff	ARM/GlobalISel: Get pointer type from value instead of getPointerSize Avoid using getPointerSize and pass through the original value type.	2022-03-31 16:46:23 -04:00
Matt Arsenault	395f8ccfc9	RegAllocGreedy: Fix typo	2022-03-31 16:30:01 -04:00
Stefan Pintilie	585c85abe5	[PowerPC] Fix lowering of byval parameters for sizes greater than 8 bytes. To store a byval parameter the existing code would store as many 8 byte elements as was required to store the full size of the byval parameter. For example, a paramter of size 16 would store two element of 8 bytes. A paramter of size 12 would also store two elements of 8 bytes. This would sometimes store too many bytes as the size of the paramter is not always a factor of 8. This patch fixes that issue and now byval paramters are stored with the correct number of bytes. Reviewed By: nemanjai, #powerpc, quinnp, amyk Differential Revision: https://reviews.llvm.org/D121430	2022-03-31 15:12:46 -05:00
Stefan Pintilie	2e55bc9f3c	[PowerPC] Set the special DSCR with a compiler option. Add a compiler option and the instructions required to set the special Data Stream Control Register (DSCR). The special register will not be set by default. Original patch by: Muhammad Usman Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D117013	2022-03-31 14:06:30 -05:00
Florian Hahn	14e3650f01	Revert "Recommit "[LV] Remove unneeded createHeaderBranch.(NFCI)"" This reverts commit `8378a71b6c`. It looks like this patch uncovered another issue, e.g. see https://lab.llvm.org/buildbot/#/builders/168/builds/5518	2022-03-31 19:00:48 +01:00
Paul Kirth	46774df307	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-03-31 17:38:21 +00:00
Thomas Symalla	1a6aa8b195	[AMDGPU] Add missing use check in SIOptimizeExecMasking pass. Whenever a v_cmp, s_and_saveexec instruction sequence shall be transformed to an equivalent s_mov, v_cmpx sequence, it needs to be detected if the v_cmp target register is used between the two instructions as the v_cmp result gets omitted by using the v_cmpx instruction, resulting in invalid code. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D122797	2022-03-31 19:25:35 +02:00
Simon Pilgrim	535211c3eb	[X86] Remove redundant FIXME lowerV64I8Shuffle has been extended a lot since this was added.	2022-03-31 18:05:52 +01:00
Simon Pilgrim	fac1729924	[X86] lowerV64I8Shuffle - don't use lowerShuffleWithPERMV until we've tried simpler options Shuffle combining will still lower to this with better fast cross lane checks. Noticed while triaging Issue #54658	2022-03-31 18:05:51 +01:00
Abinav Puthan Purayil	898d5776ec	[AMDGPU][GlobalISel] Scalarize add/sub with overflow ops in the legalizer Differential Revision: https://reviews.llvm.org/D122803	2022-03-31 21:46:34 +05:30
Peter Waller	f1cb816f90	[AArch64][SVE] Mark {CNT*,RDVL,INDEX} as materializable Differential Revision: https://reviews.llvm.org/D122731	2022-03-31 15:28:24 +00:00
Fraser Cormack	ee51aefba0	[RISCV][NFC] Minor formatting fix	2022-03-31 16:15:22 +01:00
Wenju He	0bda12b5bc	[NewPM] Add OptimizerEarly module extension point VectorizerStart extension is module callback in old PM, but is function callback in new PM. We lack a module extension point between end of buildModuleSimplificationPipeline and the function optimization (including vectorizer) pipeline. So this patch adds a new module extension point before the function optimization pipeline. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D122296	2022-03-31 08:22:27 -07:00
Changpeng Fang	1711020c37	AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal Summary: To compute the size of a VALU/SALU instruction, we need to check whether an operand could ever be literal. Previously isLiteralConstant was used, which missed cases like global variables or external symbols. These misses lead to under-estimation of the instruction size and branch offset, and thus incorrectly skip the necessary branch relaxation when the branch offset is actually greater than what the branch bits can hold. In this work, we use isLiteralConstantLike to check the operands. It maybe conservative, but it is safe. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D122778	2022-03-31 08:06:31 -07:00
Nikita Popov	33ac23e7cf	[Float2Int] Avoid unnecessary lamdbas (NFC) Instead of first creating a lambda for calculating the range, then collecting the ranges for the operands, and then calling the lambda on those ranges, we can first calculate the operand ranges and then calculate the result directly in the switch.	2022-03-31 16:13:13 +02:00
Nikita Popov	f66975555f	[Float2Int] Extract calcRange() method (NFC) This avoids the awkward "Abort" flag, because we can simply early-return instead.	2022-03-31 16:13:13 +02:00
Florian Hahn	8378a71b6c	Recommit "[LV] Remove unneeded createHeaderBranch.(NFCI)" This reverts the revert commit `2760cdc9c6`. This version pulls in the code to create the vector loop object in VPlan from D121624. This is needed because otherwise existing LoopInfo verification will fail, as a loop block doesn't have in-loop successors now that we do not replace the branch. Now that we do not add new loops during skeleton construction, there's also no need to verify LI there.	2022-03-31 14:48:32 +01:00
Sanjay Patel	4a54e3eed3	[x86] try to replace 0.0 in fcmp with negated operand This inverts a fold recently added to IR with: `3491f2f4b0` We can put -bidirectional on the Alive2 examples to show that the reverse transforms work: https://alive2.llvm.org/ce/z/8iVQwB The motivation for the IR change was to improve matching to 'fabs' in IR (see https://github.com/llvm/llvm-project/issues/38828 ), but it regressed x86 codegen for 'not-quite-fabs' patterns like (X > -X) ? X : -X. Ie, when there is no fast-math (nsz), the cmp+select is not a proper fabs operation, but it does map nicely to the unusual NAN semantics of MINSS/MAXSS. I drafted this as a target-independent fold, but it doesn't appear to help any other targets and seems to cause regressions for SystemZ at least. Differential Revision: https://reviews.llvm.org/D122726	2022-03-31 09:17:49 -04:00
Fraser Cormack	a276d1f44b	[RISCV][NFC] Fix formatting on one line	2022-03-31 13:17:37 +01:00
Serge Pavlov	47b3b76825	Implement inlining of strictfp functions According to the current design, if a floating point operation is represented by a constrained intrinsic somewhere in a function, all floating point operations in the function must be represented by constrained intrinsics. It imposes additional requirements to inlining mechanism. If non-strictfp function is inlined into strictfp function, all ordinary FP operations must be replaced with their constrained counterparts. Inlining strictfp function into non-strictfp is not implemented as it would require replacement of all FP operations in the host function, which now is undesirable due to expected performance loss. Differential Revision: https://reviews.llvm.org/D69798	2022-03-31 19:15:52 +07:00
Alexandros Lamprineas	b4417075dc	[FuncSpec] Constant propagate multiple arguments for recursive functions. This fixes a TODO in constantArgPropagation() to make it feature complete. However, I do find myself in agreement with the review comments in https://reviews.llvm.org/D106426. I don't think we should pursue specializing such recursive functions as the code size increase becomes linear to 'max-iters'. Compiling the modified test just with -O3 (no function specialization) generates the same code. Differential Revision: https://reviews.llvm.org/D122755	2022-03-31 13:00:08 +01:00
Florian Hahn	2760cdc9c6	Revert "[LV] Remove unneeded createHeaderBranch.(NFCI)" This reverts commit `32bc83d11e`. This is causing bots with expensive-checks to fail. Revert while I investigate.	2022-03-31 12:32:50 +01:00
Abinav Puthan Purayil	acf83abcbf	[AMDGPU][GlobalISel] Remove unused variable. NFC.	2022-03-31 16:50:34 +05:30
Luo, Yuanke	6753eb0c90	[X86][AMX] Materialize undef or zero value to tilezero The AMX combiner would store undef or zero to stack and invoke tileload to load the data to tile register. To avoid the store/load, we can materialzie undef or zero value to tilezero. Differential Revision: https://reviews.llvm.org/D122714	2022-03-31 19:10:28 +08:00
Florian Hahn	32bc83d11e	[LV] Remove unneeded createHeaderBranch.(NFCI) The only remaining use was to get the exit block of the loop. Instead of relying on the loop, use the successor of VectorHeaderBB (LoopMiddleBlock) directly to set VPTransformState::CFG::ExitB Depends on D121621. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121623	2022-03-31 11:48:52 +01:00
Nicholas Guy	7d676714fb	[AArch64] Set MaxBytesForLoopAlignment for more targets Differential Revision: https://reviews.llvm.org/D122566	2022-03-31 11:37:11 +01:00
Florian Hahn	2c494f0941	[VPlan] Remove unneeded Loop variable (NFC). Suggested in D121623. The remaining uses of L can be replaced, reducing the need for the variable.	2022-03-31 10:34:28 +01:00
Marco Elver	b8e49fdcb1	[AddressSanitizer] Allow prefixing memintrinsic calls in kernel mode Allow receiving memcpy/memset/memmove instrumentation by using __asan or __hwasan prefixed versions for AddressSanitizer and HWAddressSanitizer respectively when compiling in kernel mode, by passing params -asan-kernel-mem-intrinsic-prefix or -hwasan-kernel-mem-intrinsic-prefix. By default the kernel-specialized versions of both passes drop the prefixes for calls generated by memintrinsics. This assumes that all locations that can lower the intrinsics to libcalls can safely be instrumented. This unfortunately is not the case when implicit calls to memintrinsics are inserted by the compiler in no_sanitize functions [1]. To solve the issue, normal memcpy/memset/memmove need to be uninstrumented, and instrumented code should instead use the prefixed versions. This also aligns with ASan behaviour in user space. [1] https://lore.kernel.org/lkml/Yj2yYFloadFobRPx@lakrids/ Reviewed By: glider Differential Revision: https://reviews.llvm.org/D122724	2022-03-31 11:14:42 +02:00
Simon Pilgrim	481b185620	[X86] combineCarryThroughADD - recognise X86ISD::ADD(AND(X,1),-1) pattern can be folded to X86ISD::BT As mentioned on D122482, if we've generated a masked overflow test see if we can fold it to X86ISD::BT to feed a X86ISD::ADC/SBB Differential Revision: https://reviews.llvm.org/D122572	2022-03-31 09:52:55 +01:00
ShihPo Hung	2f1261abe4	[RISCV][RVV] Add Uses = [FRM] and mayRaiseFPException = true to RVV instructions This patch adds Uses = [FRM] and mayRaiseFPException = true to following instructions: VFADD, VFSUB, VFRSUB, VFMUL, VFDIV, VFRDIV VFWADD, VFWSUB, VFWMUL VFMADD, VFMACC, VFMSAC, VFMSUB VFNMADD, VFNMACC, VFNMSAC, VVFNMSUB VFWMACC, VFWMSAC, VFWNMACC, VFWNMSAC VFSQRT, VFREC7 VFREDOSUM, VFREDUSUM, VFWREDOSUM, VFWREDUSUM and only adds mayRaiseFPException = true to following instructions: VFRSQRT7, VFMIN, VFMAX, VFREDMIN, VFREDMAX VMFEQ, VMFNE, VMFLT,VMFLE, VMFGT, VMFGE Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121087	2022-03-31 01:33:17 -07:00
David Green	b65267ca7b	[LV] Invalidate widening decisions after maximizing vector bandwidth When MaximizeVectorBandwidth is enabled, we can end up (via calls to collectUniformsAndScalars/setCostBasedWideningDecision through calculateRegisterUsage) making widening decisions before we have decided whether to fold the tail by masking. These decisions will be wrong if we later decided to fold the tail, for example when the trip count is very low. It will use incorrect costs for loads that should get masked, using standard memory operation costs instead. This still at the moment uses the EmulatedMaskMemRefHack costs (a bit unfortunately), but the old costs without this change were 1, leading to too optimistic vectorization. This slightly changes the way that the MaximizeVectorBandwidth option works to make it easier to test, always honouring the option if it is set. Differential Revision: https://reviews.llvm.org/D120215	2022-03-31 09:19:31 +01:00
Fraser Cormack	893d63fbdc	[RISCV][NFC] Fix comment to refer to correct file	2022-03-31 08:59:10 +01:00
Argyrios Kyrtzidis	5426da8ffa	[Support/BLAKE3] Re-enable building with the simd-optimized implementations, v2 * Support compiling with clang-5 * Check for `LLVM_DISABLE_ASSEMBLY_FILES` and have it set by `compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh` which wants to receive and process only bitcode files.	2022-03-31 01:00:03 -07:00
Lian Wang	b3851e9931	[RISCV] Add VL patterns for vfwmul/vfwadd/vfwsub Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122369	2022-03-31 07:08:58 +00:00
Serge Pavlov	881350a92d	Mapping of FP operations to constrained intrinsics A new function 'getConstrainedIntrinsic' is added, which for any gived instruction returns id of the corresponding constrained intrinsic. If there is no constrained counterpart for the instruction or the instruction is already a constrained intrinsic, the function returns zero. This is recommit of `115b3ace36`, reverted in `8160dd582b`. Differential Revision: https://reviews.llvm.org/D69562	2022-03-31 11:07:47 +07:00
wanglei	a1c6743922	[LoongArch] Construct codegen infra and generate first add instruction. This patch constructs codegen infra and successfully generate the first 'add' instruction. Add integer calling convention for fixed arguments which are passed with general-purpose registers. New test added here: CodeGen/LoongArch/ir-instruction/add.ll The test file is placed in a subdirectory because we will use subdirctories to distinguish different categories of tests (e.g. intrinsic, inline-asm ...) Reviewed By: MaskRay, SixWeining Differential Revision: https://reviews.llvm.org/D122366	2022-03-31 11:57:07 +08:00
Aditya Kumar	368681f803	[GVNHoist] drop debug location according to the debug info guide According to the LLVM debug info update guide: https://llvm.org/docs/HowToUpdateDebugInfo.html, "Hoisting identical instructions which appear in several successor blocks into a predecessor block. In this case there is no single merged instruction. The rule for dropping locations applies". Thanks to Yuanbo Li for reporting this. Reviewed By: dblaikie Reviewers: sebpop, tejohnson, dblaikie Differential Revision: https://reviews.llvm.org/D122730	2022-03-30 20:17:53 -07:00
Stephen Long	e02f4976ac	[LoopIdiom] Merge TBAA of adjacent stores when creating memset Factor in the TBAA of adjacent stores instead of just the head store when merging stores into a memset. We were seeing GVN remove a load that had a TBAA that matched the 2nd store because GVN determined it didn't match the TBAA of the memset. The memset had the TBAA of only the first store. i.e. Loading the field pi_ of shared_count after memset to create an array of shared_ptr template<class T> class shared_ptr { T p; shared_count refcount; }; class shared_count { sp_counted_base pi_; }; Differential Revision: https://reviews.llvm.org/D122205	2022-03-30 16:54:49 -07:00
Florian Hahn	e4543af4e6	[VPlan] Track current vector loop in VPTransformState (NFC). Instead of looking up the vector loop using the header, keep track of the current vector loop in VPTransformState. This removes the requirement for the vector header block being part of the loop up front. A follow-up patch will move the code to generate the Loop object for the vector loop to VPRegionBlock. Depends on D121619. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121621	2022-03-30 22:16:40 +01:00
Fangrui Song	e572927f63	[AutoUpgrade] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2022-03-30 13:31:18 -07:00
Ben Barham	3fda0edc51	[VFS] RedirectingFileSystem only replace path if not already mapped If the `ExternalFS` has already remapped a path then the `RedirectingFileSystem` should not change it to the originally provided path. This fixes the original path always being used if multiple VFS overlays were provided and the path wasn't found in the highest (ie. first in the chain). This also renames `IsVFSMapped` to `ExposesExternalVFSPath` and only sets it if `UseExternalName` is true. This flag then represents that the `Status` has an external path that's different from its virtual path. Right now the contained path is still the external path, but further PRs will change this to always be the virtual path. Clients that need the external can then request it specifically. Note that even though `ExposesExternalVFSPath` isn't set for all VFS-mapped paths, `IsVFSMapped` was only being used by a hack in `FileManager` that was specific to module searching. In that case `UseExternalNames` is always `true` and so that hack still applies. Resolves rdar://90578880 and llvm-project#53306. Differential Revision: https://reviews.llvm.org/D122549	2022-03-30 11:52:41 -07:00
Craig Topper	4477500533	[RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases Previously, these isel optimizations were disabled if the AND could be selected as a ANDI instruction. This patch disables the optimizations only if the immediate is valid for C.ANDI. If we can't use C.ANDI, we might be able to compress the shift instructions instead. I'm not checking the C extension since we have relatively poor test coverage of the C extension. Without C extension the code size should be equal. My only concern would be if the shift+andi had better latency/throughput on a particular CPU. I did have to add a peephole to match SRLIW if the input is zexti32 to prevent a regression in rv64zbp.ll. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122701	2022-03-30 11:46:42 -07:00
Craig Topper	7417eb29ce	[RISCV] Use getSplatBuildVector instead of getSplatVector for fixed vectors. The splat_vector will be legalized to build_vector eventually anyway. This patch makes it take fewer steps. Unfortunately, this results in some codegen changes. It looks like it comes down to how the nodes were ordered in the topological sort for isel. Because the build_vector is created earlier we end up with a different ordering of nodes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122185	2022-03-30 11:36:34 -07:00
Chang-Sun Lin Jr	c28ce745cf	Value-number GVNHoist loads by result type as well as pointer address. Avoids merge errors when opaque pointers are loaded into different types. Reviewed by: jcranmer-intel, hiraditya Differential Revision: https://reviews.llvm.org/D122521	2022-03-30 11:33:49 -07:00
Craig Topper	85eae45520	[SelectionDAG] Move extension type for ConstantSDNode from getCopyToRegs to HandlePHINodesInSuccessorBlocks. D122053 set the ExtendType for ConstantSDNodes in getCopyToRegs to ZERO_EXTEND to match assumptions in ComputePHILiveOutRegInfo. PHIs are probably not the only way ConstantSDNodeNodes can get to getCopyToRegs. This patch adds an ExtendType parameter to CopyValueToVirtualRegister and has HandlePHINodesInSuccessorBlocks pass ISD::ZERO_EXTEND for ConstantInts. This way we only affect ConstantSDNodes for PHIs. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122171	2022-03-30 11:32:43 -07:00
Florian Hahn	e8673f2f20	[LV] Do not create separate latch block in VPlan::execute. Now that all dependencies on creating the latch block up-front have been removed, there is no need to create it early. Depends on D121618. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121619	2022-03-30 17:31:38 +01:00
Fraser Cormack	73244e8f85	[VP] Add vp.icmp comparison intrinsic and docs This patch mostly follows up on D121292 which introduced the vp.fcmp intrinsic. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122729	2022-03-30 17:05:11 +01:00
Nikita Popov	d6887256c2	[AutoUpgrade] Don't upgrade intrinsics returning overloaded struct type We only want to do the upgrade from named to anonymous struct return if the intrinsic is declared to return a struct, but not if it has an overloaded return type that just happens to be a struct. In that case the struct type will be mangled into the intrinsic name and there is no problem. This should address the problem reported in https://reviews.llvm.org/D122471#3416598.	2022-03-30 17:27:26 +02:00
Sanjay Patel	436b875e49	[SDAG] avoid libcalls to fmin/fmax for soft-float targets This is an extension of D70965 to avoid creating a mathlib call where it did not exist in the original source. Also see D70852 for discussion about an alternative proposal that was abandoned. In the motivating bug report: https://github.com/llvm/llvm-project/issues/54554 ...we also have a more general issue about handling "no-builtin" options. Differential Revision: https://reviews.llvm.org/D122610	2022-03-30 11:22:03 -04:00

1 2 3 4 5 ...

156860 Commits