Using the legacy PM for the optimization pipeline is deprecated and in
the process of being removed. This is a small step in that direction.
For an example of migrating to the new PM:
853b57fe80
This class will be used to properly solve the `__imp_` symbol and jump-thunk generation issues. It is assumed to be the last definition generator to be called, so the only symbols remaining in the lookup set are the symbols that are supposed to be resolved outside this JITDylib. Instead of just letting them through, we issue another lookup invocation to fetch the allocated addresses, and then create a JITLink graph containing `__imp_` GOT symbols and jump-thunks targeting the fetched addresses.
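A minimal sketch of where such a generator hooks in, assuming the ORC DefinitionGenerator interface; the class name and the commented steps are illustrative, not the actual implementation:
```
#include "llvm/ExecutionEngine/Orc/Core.h"

using namespace llvm;
using namespace llvm::orc;

// Illustrative skeleton only; the real class and graph-building logic live in
// the patch itself.
class ImpSymbolThunkGenerator : public DefinitionGenerator {
public:
  Error tryToGenerate(LookupState &LS, LookupKind K, JITDylib &JD,
                      JITDylibLookupFlags JDLookupFlags,
                      const SymbolLookupSet &LookupSet) override {
    // Runs last, so everything left in LookupSet is expected to resolve
    // outside this JITDylib:
    //  1. issue a second lookup to fetch the allocated addresses,
    //  2. build a JITLink graph defining __imp_<name> GOT entries and
    //     jump-thunks targeting those addresses,
    //  3. add that graph to JD so the new definitions become visible.
    return Error::success();
  }
};
```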
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D131833
The basic patterns look like this:
https://alive2.llvm.org/ce/z/MDj9EC
The tests have a use of the overflow value too.
Otherwise, existing folds should reduce already.
This was noted as a missing IR fold in:
926e7312b2
Hopefully, this makes it easier to implement a backend
fix because we should get the same IR regardless of
whether the source used builtins or inline code.
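As a hedged illustration of the source-level divergence mentioned above (the specific operation and types are assumptions for this sketch, not taken from the patch), these are the sort of builtin and inline-code forms one wants to canonicalize to the same IR:
```
#include <cstdint>

// Overflow-checking subtraction written with a builtin...
bool sub_builtin(uint32_t A, uint32_t B, uint32_t &R) {
  return __builtin_sub_overflow(A, B, &R);
}

// ...and the equivalent hand-written "inline code" form.
bool sub_inline(uint32_t A, uint32_t B, uint32_t &R) {
  R = A - B;    // wrapping subtract
  return A < B; // manual borrow/overflow check
}
```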
We have a good selection of W instructions, so promoting a truncated
value back to i64 is often free.
This appears to be a net code size reduction on SPECINT2006.
This has been split from D130397 as one of the patches needed to
complete that.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D131819
We already support SGE, so the same logic should hold for SLE with
the LHS and RHS swapped.
I didn't see this in the wild. Just happened to walk past this code
and thought it was odd that it was asymmetric in what condition
codes it handled.
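A hedged sketch of the symmetry being relied on (the function and operand names are illustrative, not the patched code):
```
#include <cstdint>
#include <utility>

// (LHS SLE RHS) is the same predicate as (RHS SGE LHS), so the SLE case can
// reuse the existing SGE handling once the operands are swapped.
bool evalSignedCompare(int64_t LHS, int64_t RHS, bool IsSLE) {
  if (IsSLE)
    std::swap(LHS, RHS); // reduce SLE to the already-handled SGE form
  return LHS >= RHS;     // existing SGE logic
}
```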
Reviewed By: spatel, reames
Differential Revision: https://reviews.llvm.org/D131805
I had hoped to make this a generic fold in DAGCombine, but there's quite a few regressions in Thumb2 MVE that need addressing first.
Fixes regressions from D106675.
This is a follow-up to D131350, which caused another problem for i64
types being split into i32 on i32 targets. This patch tries to make sure
that either illegal types are OK, or that the element types of a
buildvector are legal and at least as large as the original element
types.
Differential Revision: https://reviews.llvm.org/D131883
R1 is a reserved register, but LLVM provides APIs to know whether it is
used or not. So this patch uses these APIs to only save/clear/restore R1
in interrupts when necessary.
The main issue here was getting inline assembly to work. One could argue
that this is the job of Clang, but for consistency I've made sure that
R1 is always usable in inline assembly even if that means clearing it
when it might not be needed.
Information on inline assembly in AVR can be found here:
https://www.nongnu.org/avr-libc/user-manual/inline_asm.html#asm_code
Essentially, this seems to suggest that r1 can be freely used in avr-gcc
inline assembly, even without specifying it as an input operand.
Differential Revision: https://reviews.llvm.org/D117426
The code to support the case when the register allocator has assigned
the same register to the src and the dst register operand isn't actually
needed:
* LDWRdPtr and LDDWRdPtrQ have an @earlyclobber on the output
register, so the register allocator will make sure to allocate a
different register for the output register.
* LDDWRdYQ does not have an @earlyclobber, but the pointer register is
the fixed Y register which is reserved. The register allocator won't
use reserved registers for the output value.
This removes a special case in the code that makes the pseudo
instruction expansion pass more complicated than it needs to be.
Differential Revision: https://reviews.llvm.org/D131844
This patch really just extends D39946 towards stores as well as loads.
While the patch is in SelectionDAGBuilder, it only applies to AVR (the
only target that supports unaligned atomic operations).
Differential Revision: https://reviews.llvm.org/D128483
This reverts commit 34ae308c73.
Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.
The outer signext_inreg is redundant in the following:
Fold (signext_inreg (extract_subvector (zext|anyext|sext iN_value to _) _) from iN)
-> (extract_subvector (signext iN_value to iM))
Tests are precommitted and cloned by analogy from the AND case in
the same file. A negative test is added to check that the extension width
is handled correctly.
This patch supersedes D130700.
Differential Revision: https://reviews.llvm.org/D131503
These are guaranteed not to create undef/poison (although they may pass it through); the associated ISD::VALUETYPE node is also guaranteed never to generate poison.
`getContext().setMCLineTableRootFile` (from D62074) sets `RootFile.Name` to
`FirstCppHashFilename`. `RootFile.Name` is not processed by -fdebug-prefix-map
and will go to DW_TAG_compile_unit's DW_AT_name and DW_TAG_label's
DW_AT_decl_file. Remap `RootFile.Name`.
Fix another issue reported by https://github.com/llvm/llvm-project/issues/56609
Reviewed By: #debug-info, dblaikie, raj.khem
Differential Revision: https://reviews.llvm.org/D131848
Add a fix to check that FDE pc-begin targets are defined before calling
getBlock (which will crash if the target is not defined). FDE pc-begins
pointing at undefined symbols are expected to arise only in obscure
circumstances (malformed objects, or removal of targets by JITLink
passes), but we want to handle them gracefully. With this patch the
FDE will be retained, but without any keepalive edge to it. Unless
some pass takes action to mark it as live it will be dead-stripped.
To make it easier for passes to connect FDEs to their targets a new
EHFrameCFIBlockInspector utility is added. This allows clients to
quickly determine whether a CFI record is a CIE or an FDE (assuming
that it's valid), and retrieve any personality, pc-begin, cie, or
LSDA edges associated with it.
MapperJITLinkMemoryManager uses a free list to keep track of available
memory regions. Using an IntervalMap instead of a vector allows automatic
coalescing of memory regions as they are freed.
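A hedged sketch of the coalescing this buys, assuming llvm::IntervalMap's merging of adjacent ranges that carry equal values; the ranges and value type are illustrative, not the manager's actual bookkeeping:
```
#include "llvm/ADT/IntervalMap.h"
#include <cstdint>

using FreeList = llvm::IntervalMap<uint64_t, bool>;

int main() {
  FreeList::Allocator Alloc;
  FreeList Free(Alloc);

  // Free two adjacent regions; IntervalMap coalesces them because the
  // intervals touch and hold the same value.
  Free.insert(0x1000, 0x1FFF, true); // [0x1000, 0x1FFF]
  Free.insert(0x2000, 0x2FFF, true); // merges into [0x1000, 0x2FFF]

  // A vector-based free list would keep two separate entries here; the
  // coalesced interval lets a later allocation reuse one contiguous region.
  return 0;
}
```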
Differential Revision: https://reviews.llvm.org/D131831
This is to avoid f16->i64 being lowered to `__fixhfdi/__fixunshfdi` on 32-bit targets, since neither libgcc nor compiler-rt provides them. https://godbolt.org/z/cjWEsea5v
It also helps improve performance by promoting the vector type.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D131828
This reverts commit 38c2366b3f.
This patch seems to break bootstrapping LLVM with `-fglobal-isel -O3`
on AArch64 hardware. Without the revert, there are 500+ test
failures for the `check-llvm-codegen-x86` target.
(A | ?) | (A ^ B) --> (A | ?) | B
https://alive2.llvm.org/ce/z/dbNQw4
This extends the existing transform to peek through
another 'or' instruction for the common operand.
This is the underlying missing fold that should allow
issue #56711 and issue #57120 to reduce even more.
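A hedged C-level illustration of the fold (variable names are arbitrary): since A already appears in the left 'or', the A inside (A ^ B) contributes nothing new, so the expression reduces to the simpler form.
```
unsigned before(unsigned A, unsigned C, unsigned B) {
  return (A | C) | (A ^ B); // matches (A | ?) | (A ^ B)
}

unsigned after(unsigned A, unsigned C, unsigned B) {
  return (A | C) | B;       // the reduced form
}
```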
(setcc x, y, eq/ne) lowers to seqz/snez, which set rd to 0/1.
An addi is used to process the immediate, which can save the instructions needed to load the immediate.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D131471
Make the sched_getaffinity-based implementation available to all architectures
(except s390x/x86, which have a custom implementation). The `CPU_ALLOC(2048)`
code supports all `CONFIG_NR_CPUS` values in Linux kernel `arch/*/configs/`.
The function is mainly used by in-process ThinLTO to decide the default number
of threads. Returning -1 will use just one thread.
Android is excluded because of the higher API level requirement:
`sched_getaffinity; # introduced-arm=12 introduced-arm64=21 introduced-x86=12 introduced-x86_64=21`
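A minimal sketch of the counting approach described above, assuming glibc on Linux; the helper name is illustrative:
```
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // CPU_ALLOC and friends are GNU extensions
#endif
#include <sched.h>

// Count the CPUs the current process may run on; returns -1 on failure, in
// which case the caller falls back to a single thread as described above.
static int countAvailableCPUs() {
  cpu_set_t *Set = CPU_ALLOC(2048); // mirrors the CPU_ALLOC(2048) above
  if (!Set)
    return -1;
  size_t Size = CPU_ALLOC_SIZE(2048);
  CPU_ZERO_S(Size, Set);
  int N = -1;
  if (sched_getaffinity(0, Size, Set) == 0) // 0 = the calling process
    N = CPU_COUNT_S(Size, Set);
  CPU_FREE(Set);
  return N;
}
```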
When memory is deallocated from MapperJITLinkMemoryManager, deinitialize
actions are run through the mapper. In the case of InProcessMapper, the memory
protections of the region are reset to read/write (they were previously
changed), so the region can be reused in the future.
Differential Revision: https://reviews.llvm.org/D131768
Reland commit 719658d078
The base RA support infrastructure only allows a specific register
class to be allocated in the RA pass. Since greedy RA and basic RA derive
from the base RA, they both allow allocating a specific register class.
Fast RA doesn't support allocating registers for a specific register class.
This patch enables ShouldAllocateClass in fast RA, so that it can
support allocating registers for a specific register class.
Differential Revision: https://reviews.llvm.org/D131825
Initial platform support for COFF/x86_64.
Completed features:
* Statically linked orc runtime.
* Full linking/initialization of static/dynamic vc runtimes and microsoft stl libraries.
* SEH exception handling.
* Full static initializers support
* dlfns
* JIT side symbol lookup/dispatch
Things to note:
* It uses vc runtime libraries found in vc toolchain installations.
* Bootstrapping state is separated because, when statically linking the orc runtime, Microsoft STL functions are needed to initialize the orc runtime, but static initializers need to be run in order to fully initialize the STL libraries.
* Process symbols can't be used blindly on the MSVC platform; otherwise a duplicate definition error gets generated. If process symbols are used, an out-of-reach error is bound to occur at some point.
* Atexit is currently not handled -- it will be handled in follow-up patches.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D130479
DWARF v5 6.2.4 The Line Number Program Header says:
> The first entry is the current directory of the compilation. Each additional
> path entry is either a full path name or is relative to the current directory of
> the compilation.
When forming a path, relative DW_AT_comp_dir and directories[0] are not supposed
to be joined together. Fix getFileNameByIndex to special case DWARF v5 DirIdx == 0.
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D131804
The linker may convert such an ADD into a LEA, so we must not
use the EFLAGS output.
This causes miscompiles with -fsanitize=null after
bacdf80f42 added
llvm.threadlocal.address -- previously, global variables were known to
be non-null, but the intrinsic is not currently known to return
nonnull. (That should be corrected, but it shouldn't've caused
miscompiles!)
Differential Revision: https://reviews.llvm.org/D131716
For generated assembly debug info, MCDwarfLineTableHeader::CompilationDir is an
unmapped path set in MCContext::setGenDwarfRootFile. Remap it.
A relative destination path of -fdebug-prefix-map= exposes an llvm-dwarfdump bug
which joins relative DW_AT_comp_dir and directories[0].
Fix https://github.com/llvm/llvm-project/issues/56609
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D131749
This patch supports SPIR-V capabilities and extensions. In addition,
it inserts decorations related to MIFlags and improves support of switches.
Five tests are included to demonstrate the improvement.
Differential Revision: https://reviews.llvm.org/D131221
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
Allows for even more savings in the binary image while simultaneously removing the name of the offending stack variable.
Depends on D131631
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D131728
We were dereferencing an empty Optional if IgnoreErrors was true and the
stat failed.
rdar://60887887
Differential Revision: https://reviews.llvm.org/D131791
The value of the attribute is a size in bytes. It has the effect of
suppressing inlining of functions whose stack sizes exceed the given value.
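A hedged sketch of attaching such an attribute from C++; the attribute name "inline-max-stacksize" and the 4096-byte threshold are assumptions for illustration, not taken from the message above:
```
#include "llvm/IR/Function.h"

// Mark F so the inliner skips it once its stack size exceeds the threshold.
// The string attribute name here is assumed; the value is a size in bytes.
void capInlineStackSize(llvm::Function &F) {
  F.addFnAttr("inline-max-stacksize", "4096");
}
```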
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D129904
We're seeing non-determinism with loading sample profiles. It seems to
be related to the order in which we merge FunctionSamples in
promoteMergeNotInlinedContextSamples(). Use a MapVector to iterate over
NonInlinedCallSites in the order entries were inserted.
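A hedged sketch of why MapVector restores determinism (the key and value types are placeholders, not the real profile data structures): iteration follows insertion order rather than hash order.
```
#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/raw_ostream.h"

int main() {
  // DenseMap iteration order is unspecified and can differ between runs;
  // MapVector always visits entries in the order they were inserted.
  llvm::MapVector<llvm::StringRef, int> NonInlinedCallSites;
  NonInlinedCallSites["foo"] += 1;
  NonInlinedCallSites["bar"] += 2;
  NonInlinedCallSites["baz"] += 3;

  for (auto &KV : NonInlinedCallSites) // always foo, bar, baz
    llvm::outs() << KV.first << ": " << KV.second << "\n";
  return 0;
}
```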
Reviewed By: wenlei, davidxl
Differential Revision: https://reviews.llvm.org/D131592
The code was relicensed by its owner (Unicode.org) a long time back,
but we still had the old (problematic) license in our fork.
Note that the source files have not been distributed from unicode.org
since 2009 (due to being buggy and unmaintained upstream), but they
were given this license before that.
Fixes https://github.com/llvm/llvm-project/issues/32309
Differential Revision: https://reviews.llvm.org/D66390
The goal is to reduce the size of the MSan track-origins binary by making
the variable name locations constant, which will allow the linker to compress
them.
Follows: https://reviews.llvm.org/D131415
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D131631
SimplifyMultipleUseDemandedBits shouldn't be creating general nodes like this - although we allow bitcasts, even general constant folding is avoided.
Removing it causes a number of regressions that need addressing first, but I've added a TODO for now.
Contextual knowledge may be used to prove invariance of some conditions.
For example, in this case:
```
; %len >= 0
guard(%iv = {start,+,1}<nuw> <s %len)
guard(%iv = {start,+,1}<nuw> <u %len)
```
the 2nd check always fails if `start` is negative (the unsigned comparison
treats the negative value as huge) and always passes otherwise (for a
non-negative `%iv`, the signed and unsigned comparisons against the
non-negative `%len` agree).
It looks like there are more opportunities of this kind still to be
implemented in the future.
Differential Revision: https://reviews.llvm.org/D129753
Reviewed By: apilipenko
Closing https://github.com/llvm/llvm-project/issues/56329
The problem happens when we try to simplify the suspend points. We might
break the assumption that the final suspend lives in the last slot of
Shape.CoroSuspends. This patch tries to maintain that assumption and fixes
the problem.
TLS debug info on AIX is not ready for now.
The location generated in no-integrated-as mode is wrong, and
in integrated-as mode it causes an AIX linker error.
Reviewed By: Esme
Differential Revision: https://reviews.llvm.org/D130245
Since c5b3de6745 (git main,
August 11th), Clang does generate working hidden visibility
on MinGW targets. Using that reduces the number of exports from
a dylib build of LLVM significantly, which is vital for fitting
within the limit of 64k exported symbols from a DLL.
It's essential that if we set CMAKE_CXX_VISIBILITY_PRESET=hidden
(which passes -fvisibility=hidden on the command line), we also
must define LLVM_EXTERNAL_VISIBILITY consistently to override
it. (If there are mismatches, e.g. setting hidden visibility generally
but never overriding it back to default for the symbols that do need
to be exported, we'd get broken builds in such configurations.)
We don't want to be using __attribute__((visibility("hidden"))) on
MinGW with GCC, because GCC produces a warning about it. (GCC hasn't
warned about the command line options that set hidden visibility
though.) Clang has historically not warned about either of them, so
it is harmless to use the hidden visibility when building with older
Clang (so we don't need to detect the exact version of Clang/LLVM where
it has an effect).
This reduces the number of exported symbols for a dylib build of LLVM;
previously libLLVM exported around 64650 symbols (when the maximum is
65536) when the ARM, AArch64 and X86 targets were enabled. If enabling
more targets (or if building with e.g. assertions enabled), it would
exceed the limit. Now with visibility flags in use, the same build
with ARM, AArch64 and X86 ends up at around 35k exported symbols.
Differential Revision: https://reviews.llvm.org/D131661
Including patterns to select addiw if only the lower 32 bits are used.
I'm not excited about adding this many patterns. I'm looking at whether
we can create the xori during lowering and move the ineg patterns to
DAGCombiner.
lowerShuffleWithVPMOV currently only matches shuffle(truncate(x)) patterns, but on VLX targets the truncate isn't usually necessary to make the VPMOV node worthwhile (as we're only targeting v16i8/v8i16 shuffles, we almost always end up with a PSHUFB node instead). PACKSS/PACKUS are still preferred vs VPMOV due to their lower uop count.
Fixes the remaining regression from the fixes in rG293899c64b75
My most recent change for D131607 had a formatting error that I didn't
notice until after I committed it. Let me fix it now so changes to this
file will be back-to-back from me.
We manage to iteratively achieve this result with no extra
uses, and the reassociate pass can also do this, but this
pattern falls through the cracks in the example from
issue #57053.
Another ticket split out of D107285, this extends the optimization
of 0.0 - -X to just X when using constrained intrinsics and the
optimization is allowed.
If the negation of X is done with fsub then the match fails because of
the lack of IR Matcher support for constrained intrinsics.
While I'm here, remove some TODO notices since the work is no longer
planned.
Differential Revision: https://reviews.llvm.org/D131607
As mentioned on D128934, we weren't including the CPUID bit handling for the RDPRU instruction.
AMD's APMv3 (24594) lists it as CPUID Fn8000_0008_EBX Bit#4
A bfloat select operation will currently crash, but is allowed from C.
This adds handling for the operation, turning it into a FCSELHrrr if
fullfp16 is present, or converting it to a FCSELSrrr if not. The
FCSELSrrr is created via using INSERT_SUBREG/EXTRACT_SUBREG to convert
the bf16 to a f32 and using the f32 pattern for FCSELSrrr. (I originally
attempted to do this via a tablegen pattern, but it appears that the
nzcv glue is placed onto the wrong node, causing it to be forgotten and
incorrect scheduling to be emitted).
The FCSELSrrr can also be used for fp16 selects when +fullfp16 is not
present, which helps avoid an unnecessary promotion to f32.
Differential Revision: https://reviews.llvm.org/D131253
Src1 for mbcnt can be a non-zero literal or register. Take this into account
when calculating known bits.
Differential Revision: https://reviews.llvm.org/D131478
The patch uses a peephole method to fold merge.vvm and unmasked intrinsics into
masked intrinsics. Using a peephole instead of tablegen patterns avoids large
amounts of auto-generated code.
Note: The patch ignores segment loads since I don't know how to test them.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D130442
Follow-up after D131595, see comments in the review thread.
The intention of having two constructors was to minimize copies of the
`vector`, but a lack of `std::move` at the call site caused the wrong
constructor to be called.
Switched to a single constructor that accepts a value.
Accepting by value makes it possible to have a single constructor and still
decide whether to copy or move at the call site.
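A hedged sketch of the pattern in isolation (the class and member names are illustrative):
```
#include <utility>
#include <vector>

struct Config {
  // Single constructor: take the vector by value and move it into place.
  explicit Config(std::vector<int> Values) : Values(std::move(Values)) {}
  std::vector<int> Values;
};

// The call site decides the cost: pass an lvalue to copy, std::move to move.
std::vector<int> V = {1, 2, 3};
Config Copied{V};           // one copy into the parameter, then a move
Config Moved{std::move(V)}; // moves only, no copy
```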