Commit Graph

160922 Commits

Author SHA1 Message Date
Arthur Eubanks 633f5663c3 [LegacyPM] Remove ThinLTO bitcode writer legacy pass
Using the legacy PM for the optimization pipeline is deprecated and in
the process of being removed. This is a small step in that direction.

For an example of migrating to the new PM:
853b57fe80
2022-08-15 14:21:16 -07:00
Philip Reames e792a353b5 [slp] adjust debug output to include final computed cost 2022-08-15 13:51:39 -07:00
Jameson Nash 3a8d7fe201 [SimplifyCFG] teach simplifycfg not to introduce ptrtoint for NI pointers
SimplifyCFG expects to be able to cast both sides to an int, if either side can be cast to an int, but this is not desirable or legal, in general, per D104547.

Spotted in https://github.com/JuliaLang/julia/issues/45702

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128670
2022-08-15 15:11:48 -04:00
Alexey Bataev 2819126d0c [SLP][NFC]Replace multiple isa calls with single one where possible,
NFC.
2022-08-15 11:56:58 -07:00
Kazu Hirata 71d12bc2de [ExecutionEngine] Fix warnings
This patch fixes:

  llvm/lib/ExecutionEngine/Orc/ExecutionUtils.cpp:512:12: error:
  moving a temporary object prevents copy elision
  [-Werror,-Wpessimizing-move]

and:

  llvm/lib/ExecutionEngine/Orc/ExecutionUtils.cpp:515:12: error:
  moving a temporary object prevents copy elision
  [-Werror,-Wpessimizing-move]
2022-08-15 10:26:03 -07:00
Sunho Kim 0c69f9f32c [ORC][COFF] Introduce DLLImportDefinitionGenerator.
This class will be used to properly solve the `__imp_` symbol and jump-thunk generation issues. It is assumed to be the last definition generator to be called, and as it's the last generator the only symbols remaining in the lookup set are the symbols that are supposed to be queried outside this jitdylib. Instead of just letting them through, we issue another lookup invocation and fetch the allocated addresses, and then create a jitlink graph containing `__imp_` GOT symbols and jump-thunks targeting the fetched addresses.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D131833
2022-08-16 02:06:57 +09:00
Sanjay Patel e5748c6e73 [InstCombine] reduce sub-with-overflow ==/!= 0
The basic patterns look like this:
https://alive2.llvm.org/ce/z/MDj9EC

The tests have a use of the overflow value too.
Otherwise, existing folds should reduce already.

This was noted as a missing IR fold in:
926e7312b2

Hopefully, this makes it easier to implement a backend
fix because we should get the same IR regardless of
whether the source used builtins or inline code.
2022-08-15 13:03:51 -04:00
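As a rough illustration of the kind of fold e5748c6e73 describes, the sketch below assumes the pattern is "the wrapped difference of a sub-with-overflow is zero exactly when the operands are equal", whatever the overflow bit says. The helper names are hypothetical and the 8-bit width is chosen only to make an exhaustive check cheap; the Alive2 link in the commit message remains the authoritative statement of the patterns.

```cpp
#include <cstdint>

// Hypothetical stand-in for usub.with.overflow on i8: returns the wrapped
// difference and reports unsigned overflow through `overflow`.
static std::uint8_t usubWithOverflow(std::uint8_t x, std::uint8_t y,
                                     bool &overflow) {
  overflow = x < y;
  return static_cast<std::uint8_t>(x - y);
}

// Exhaustively checks that (wrapped difference == 0) agrees with (x == y).
bool subOverflowCompareMatchesEquality() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      bool overflow = false;
      std::uint8_t diff = usubWithOverflow(static_cast<std::uint8_t>(x),
                                           static_cast<std::uint8_t>(y),
                                           overflow);
      (void)overflow; // the overflow bit stays available to its other users
      if ((diff == 0) != (x == y))
        return false;
    }
  }
  return true;
}
```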
Craig Topper 7a73ab5818 [RISCV] Enable isTruncateFree in SDAG for i64->i32 on rv64.
We have a good selection of W instructions, so promoting a truncated
value back to i64 is often free.

This appears to be a net code size reduction on SPECINT2006.

This has been split from D130397 as one of the patches needed to
complete that.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D131819
2022-08-15 08:32:51 -07:00
Craig Topper ef8c34e954 [InstSimplify] sle on i1 also encodes implication
We already support SGE, so the same logic should hold for SLE with
the LHS and RHS swapped.

I didn't see this in the wild. Just happened to walk past this code
and thought it was odd that it was asymmetric in what condition
codes it handled.

Reviewed By: spatel, reames

Differential Revision: https://reviews.llvm.org/D131805
2022-08-15 08:27:23 -07:00
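The claim in ef8c34e954 can be checked by hand with a small sketch. It assumes the usual signed reading of an i1 (true is -1, false is 0); the function names are made up for illustration. Under that reading, `A sge B` is false only for A = true, B = false, so it encodes "A implies B", and `A sle B` encodes the same implication with the operands swapped.

```cpp
// Signed value of an i1: true is -1, false is 0.
static int asSignedI1(bool b) { return b ? -1 : 0; }

// A sge B on i1 holds exactly when "A implies B" holds.
bool sgeEncodesImplication() {
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b) {
      bool A = a != 0, B = b != 0;
      if ((asSignedI1(A) >= asSignedI1(B)) != (!A || B))
        return false;
    }
  return true;
}

// A sle B on i1 holds exactly when "B implies A" holds.
bool sleEncodesSwappedImplication() {
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b) {
      bool A = a != 0, B = b != 0;
      if ((asSignedI1(A) <= asSignedI1(B)) != (!B || A))
        return false;
    }
  return true;
}
```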
Simon Pilgrim a7b85e4c0c [X86] Freeze shl(x,1) -> add(x,x) vector fold (PR50468)
Vector fold shl(x,1) -> add(freeze(x),freeze(x)) to avoid the undef issues identified in PR50468

Differential Revision: https://reviews.llvm.org/D106675
2022-08-15 16:17:21 +01:00
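A simplified model of why a7b85e4c0c freezes the operand: the sketch assumes that each use of an undef value may observe a different 8-bit value, while a frozen value is chosen once and shared by all uses. This is only a toy model of undef semantics with hypothetical function names, not the DAG code itself. In this model shl(x,1) always has a zero low bit, but add(x,x) with two independent uses can be odd.

```cpp
#include <cstdint>

// add(x, x) where the two uses of an undef x may observe different values:
// returns true if some pair of observations yields an odd sum.
bool unfrozenAddCanBeOdd() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (static_cast<std::uint8_t>(a + b) & 1u)
        return true;
  return false;
}

// add(freeze(x), freeze(x)): one value is chosen once, so the sum always
// matches shl(x, 1) for that value.
bool frozenAddMatchesShl() {
  for (unsigned v = 0; v < 256; ++v)
    if (static_cast<std::uint8_t>(v + v) != static_cast<std::uint8_t>(v << 1))
      return false;
  return true;
}
```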
Simon Pilgrim 41bdb8cd36 [X86] Fold insert_vector_elt(undef, elt, 0) --> scalar_to_vector(elt)
I had hoped to make this a generic fold in DAGCombine, but there's quite a few regressions in Thumb2 MVE that need addressing first.

Fixes regressions from D106675.
2022-08-15 14:56:30 +01:00
David Green dfc95bab07 [DAG] Ensure more Legal BUILD_VECTOR elements types in shuffle->And combine
This is a followup to D131350, which caused another problem for i64
types being split into i32 on i32 targets. This patch tries to make sure
that either Illegal types are OK, or that the element types of a
buildvector are legal and bigger than or equal to the size of the
original elements.

Differential Revision: https://reviews.llvm.org/D131883
2022-08-15 14:41:45 +01:00
Luo, Yuanke 853bb192c4 Revert "(Reland) [fastalloc] Support allocating specific register class in fastalloc"
This reverts commit 30f9e6ebd3.
2022-08-15 20:33:15 +08:00
Ayke van Laethem a560e57a7e
[AVR] Only push and clear R1 in interrupts when necessary
R1 is a reserved register, but LLVM gives the APIs to know when it is
used or not. So this patch uses these APIs to only save/clear/restore R1
in interrupts when necessary.

The main issue here was getting inline assembly to work. One could argue
that this is the job of Clang, but for consistency I've made sure that
R1 is always usable in inline assembly even if that means clearing it
when it might not be needed.

Information on inline assembly in AVR can be found here:

https://www.nongnu.org/avr-libc/user-manual/inline_asm.html#asm_code

Essentially, this seems to suggest that r1 can be freely used in avr-gcc
inline assembly, even without specifying it as an input operand.

Differential Revision: https://reviews.llvm.org/D117426
2022-08-15 14:29:38 +02:00
Ayke van Laethem 43a8dbc5be
[AVR] Use @earlyclobber instead of register scavenging
The code to support the case when the register allocator has assigned
the same register to the src and the dst register operand isn't actually
needed:

  * LDWRdPtr and LDDWRdPtrQ have an @earlyclobber on the output
    register, so the register allocator will make sure to allocate a
    different register for the output register.
  * LDDWRdYQ does not have an @earlyclobber, but the pointer register is
    the fixed Y register which is reserved. The register allocator won't
    use reserved registers for the output value.

This removes a special case in the code that makes the pseudo
instruction expansion pass more complicated than it needs to be.

Differential Revision: https://reviews.llvm.org/D131844
2022-08-15 14:29:38 +02:00
Ayke van Laethem de48717fcf
[AVR] Support unaligned store
This patch really just extends D39946 towards stores as well as loads.
While the patch is in SelectionDAGBuilder, it only applies to AVR (the
only target that supports unaligned atomic operations).

Differential Revision: https://reviews.llvm.org/D128483
2022-08-15 14:29:37 +02:00
Max Kazantsev 354fa0b480 Revert "[SCEV] Use context to strengthen flags of BinOps"
This reverts commit 34ae308c73.

Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.
2022-08-15 18:51:59 +07:00
Simon Pilgrim 3a73133217 [DAG] canCreateUndefOrPoison - add freeze(sign_extend_inreg(x,vt)) -> sign_extend_inreg(freeze(x),vt) support
Guaranteed not to create undef/poison
2022-08-15 12:18:59 +01:00
Peter Waller 6e85db7293 [DAGCombine] Combine signext_inreg of extract-extend
The outer signext_inreg is redundant in the following:

  Fold (signext_inreg (extract_subvector (zext|anyext|sext iN_value to _) _) from iN)
       -> (extract_subvector (signext iN_value to iM))

Tests are precommitted and clone those by analogy from the AND case in
the same file. Add a negative test to check extension width is handled
correctly.

This patch supersedes D130700.

Differential Revision: https://reviews.llvm.org/D131503
2022-08-15 10:58:07 +00:00
Simon Pilgrim 7e294e676e [DAG] canCreateUndefOrPoison - add freeze(assertsext/zext(x,vt)) -> assertsext/zext(freeze(x),vt) support
These are guaranteed not to create undef/poison (although they may pass through) - the associated ISD::VALUETYPE node is also guaranteed never to generate poison
2022-08-15 11:13:43 +01:00
Benjamin Kramer 982779230f
Make demangler independent of LLVM again
The demangler is not supposed to include bits of LLVM, so it can't use STLExtras.

This undoes part of 6d9cd9199a
2022-08-15 11:44:28 +02:00
Fangrui Song d797c2ffdb [DebugInfo] -fdebug-prefix-map: handle '#line "file"' for asm source
`getContext().setMCLineTableRootFile` (from D62074) sets `RootFile.Name` to
`FirstCppHashFilename`. `RootFile.Name` is not processed by -fdebug-prefix-map
and will go to DW_TAG_compile_unit's DW_AT_name and DW_TAG_label's
DW_AT_decl_file. Remap `RootFile.Name`.

Fix another issue reported by https://github.com/llvm/llvm-project/issues/56609

Reviewed By: #debug-info, dblaikie, raj.khem

Differential Revision: https://reviews.llvm.org/D131848
2022-08-14 20:58:23 -07:00
Kazu Hirata f5a68feab3 Use llvm::none_of (NFC) 2022-08-14 16:25:39 -07:00
Kazu Hirata 6d9cd9199a Use llvm::all_of (NFC) 2022-08-14 16:25:36 -07:00
Krzysztof Parzyszek 40ba78679d [Hexagon] Distribute disjoint intervals at the end of expand-condsets
This fixes https://github.com/llvm/llvm-project/issues/56050.
2022-08-14 16:15:23 -05:00
Krzysztof Parzyszek 98bd252432 [Hexagon] Make some loops in HexagonExpandCondsets.cpp range-based, NFC
Plus some readability changes.
2022-08-14 16:15:06 -05:00
Nuno Lopes 0299ebc1bd InstCombine: use poison instead of undef as placeholder in insertvalue [NFC]
These vectors are fully initialized so the placeholder value is irrelevant
2022-08-14 21:37:23 +01:00
Kazu Hirata 50724716cd [Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-14 12:51:58 -07:00
Kazu Hirata 9144e49334 [Support] Drop unnecessary const from a return type (NFC)
Identified with readability-const-return-type.
2022-08-14 12:51:56 -07:00
Lang Hames 8b9b45ce54 [JITLink] Fix some missing std::moves.
This should fix failures on some bots due to 1cf81274f4
(e.g. https://lab.llvm.org/buildbot#builders/196/builds/16684)
2022-08-14 11:42:26 -07:00
Simon Pilgrim cc6d3f07f4 [M68k] Fix MSVC llvm::Optional<> deprecation warnings
Use has_value()/value() instead of hasValue()/getValue()
2022-08-14 18:54:41 +01:00
Lang Hames 1cf81274f4 [JITLink] Add eh-frame CFI inspector, fix crash on malformed FDEs.
Add a fix to check that FDE pc-begin targets are defined before calling
getBlock (which will crash if the target is not defined). FDE pc-begins
pointing at undefined symbols are expected to arise only in obscure
circumstances (malformed objects, or removal of targets by JITLink
passes), but we want to handle them gracefully. With this patch the
FDE will be retained, but without any keepalive edge to it. Unless
some pass takes action to mark it as live it will be dead-stripped.

To make it easier for passes to connect FDEs to their targets a new
EHFrameCFIBlockInspector utility is added. This allows clients to
quickly determine whether a CFI record is a CIE or an FDE (assuming
that it's valid), and retrieve any personality, pc-begin, cie, or
LSDA edges associated with it.
2022-08-14 10:49:26 -07:00
Simon Pilgrim 8b47e29fa0 [X86] combineVectorShiftImm - fold (shl (add X, X), C) -> (shl X, (C + 1))
Noticed while investigating the regressions in D106675
2022-08-14 17:42:02 +01:00
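The arithmetic behind the fold in 8b47e29fa0 can be sanity-checked with a short exhaustive loop. The sketch assumes 8-bit lanes with wrapping arithmetic and restricts C so that C + 1 is still a valid shift amount; the function name is hypothetical.

```cpp
#include <cstdint>

// Checks (shl (add X, X), C) == (shl X, (C + 1)) for every 8-bit X and every
// C with C + 1 < 8, using wrapping 8-bit arithmetic.
bool shlOfDoubledMatchesWiderShift() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned c = 0; c + 1 < 8; ++c) {
      std::uint8_t doubled = static_cast<std::uint8_t>(x + x);
      std::uint8_t lhs = static_cast<std::uint8_t>(doubled << c);
      std::uint8_t rhs = static_cast<std::uint8_t>(x << (c + 1));
      if (lhs != rhs)
        return false;
    }
  }
  return true;
}
```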
Simon Pilgrim e2d13fd096 [DAG] canCreateUndefOrPoison - add freeze(shl(x,y)) -> shl(freeze(x),y) support
These are guaranteed not to create undef/poison if the shift amount is known to be in range
2022-08-14 14:38:10 +01:00
Simon Pilgrim a621d38bcb [DAG] canCreateUndefOrPoison - add freeze(and/or/xor(x,y)) -> and/or/xor(freeze(x),y) support
These are guaranteed not to create undef/poison
2022-08-14 13:14:53 +01:00
Anubhab Ghosh 23d0e71fcb [Orc] Use IntervalMap to store free memory regions in MapperJITLinkMemoryManager
MapperJITLinkMemoryManager uses a free list to keep track of available
memory regions. Using an IntervalMap instead of a vector allows automatic
coalescing of memory regions as they are freed.

Differential Revision: https://reviews.llvm.org/D131831
2022-08-14 14:35:08 +05:30
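To show what "automatic coalescing" buys in 23d0e71fcb, here is a minimal sketch of a free list that merges adjacent regions on release. It deliberately uses `std::map` rather than `llvm::IntervalMap`, so the class and its methods are hypothetical stand-ins and not the MapperJITLinkMemoryManager code. It also assumes released regions never overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Free regions keyed by start address; the mapped value is the exclusive end.
class FreeRegions {
public:
  // Marks [start, end) as free, merging with adjacent free regions.
  void release(std::uint64_t start, std::uint64_t end) {
    auto next = regions.lower_bound(start);
    // Merge with the preceding region if it ends exactly where we start.
    if (next != regions.begin()) {
      auto prev = std::prev(next);
      if (prev->second == start) {
        start = prev->first;
        regions.erase(prev); // map erase leaves `next` valid
      }
    }
    // Merge with the following region if it starts exactly where we end.
    if (next != regions.end() && next->first == end) {
      end = next->second;
      regions.erase(next);
    }
    regions[start] = end;
  }

  // True if [start, end) lies inside a single free region.
  bool isFree(std::uint64_t start, std::uint64_t end) const {
    auto it = regions.upper_bound(start);
    if (it == regions.begin())
      return false;
    --it;
    return it->second >= end;
  }

  std::size_t count() const { return regions.size(); }

private:
  std::map<std::uint64_t, std::uint64_t> regions;
};

// Two separate regions become one once the gap between them is released.
bool coalescesNeighbours() {
  FreeRegions f;
  f.release(0, 16);
  f.release(32, 48);
  if (f.count() != 2)
    return false;
  f.release(16, 32);
  return f.count() == 1 && f.isFree(0, 48);
}
```

With a plain vector, the same merge would need an explicit search and compaction step on every release.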
Phoebe Wang 8b69549dc5 [X86][FP16] Promote FP16->[U]INT to FP16->FP32->[U]INT
This is to avoid f16->i64 being lowered to `__fixhfdi/__fixunshfdi` on 32-bits since neither libgcc nor compiler-rt provide them. https://godbolt.org/z/cjWEsea5v

It also helps to improve the performance by promoting the vector type.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D131828
2022-08-14 09:37:33 +08:00
Vitaly Buka f1596952f9 [AArch64] Fix signed integer overflow in CSINC case
https://lab.llvm.org/staging/#/builders/224/builds/2/steps/16/logs/stdio

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D131815
2022-08-13 13:12:09 -07:00
Simon Pilgrim 60534b8879 [DAG] canCreateUndefOrPoison - add freeze(add/sub/mul(x,y)) -> add/sub/mul(freeze(x),y,z) support
These are guaranteed not to create undef/poison as long as there are no poison generating flags
2022-08-13 20:58:00 +01:00
Kazu Hirata 448c466636 Use llvm::erase_value (NFC) 2022-08-13 12:55:50 -07:00
Kazu Hirata 109df7f9a4 [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-13 12:55:42 -07:00
Kazu Hirata 2117fcb1c0 Use Optional::transform instead of Optional::map (NFC)
I'm planning to deprecate map in favor of transform for consistency
with std::optional::transform in C++23.
2022-08-13 11:48:26 -07:00
Florian Hahn c2af37dcdb
Revert "[AArch64][GlobalISel] Recognise some CCMPri"
This reverts commit 38c2366b3f.

This patch seems to break bootstrapping LLVM with `-fglobal-isel -O3`
on AArch64 hardware. Without the revert, there are 500+ test
failures for the `check-llvm-codegen-x86` target.
2022-08-13 17:44:41 +01:00
Sanjay Patel 8b56fa92de [InstCombine] fix "X|(X^Y)" pattern-matching for commuted variants 2022-08-13 11:02:28 -04:00
Sanjay Patel 9d218b61cc [InstCombine] reduce or-xor-or patterns
(A | ?) | (A ^ B) --> (A | ?) | B
https://alive2.llvm.org/ce/z/dbNQw4

This extends the existing transform to peek through
another 'or' instruction for the common operand.

This is the underlying missing fold that should allow
issue #56711 and issue #57120 to reduce even more.
2022-08-13 09:52:01 -04:00
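The identity in 9d218b61cc, (A | ?) | (A ^ B) --> (A | ?) | B, holds bit by bit: where A is 1 both sides are 1, and where A is 0 the xor reduces to B. A brute-force check over 6-bit values, with `q` standing in for the "?" operand (the function name is hypothetical):

```cpp
// Exhaustively checks (a | q) | (a ^ b) == (a | q) | b over 6-bit values.
bool orXorOrReduces() {
  for (unsigned a = 0; a < 64; ++a)
    for (unsigned b = 0; b < 64; ++b)
      for (unsigned q = 0; q < 64; ++q)
        if (((a | q) | (a ^ b)) != ((a | q) | b))
          return false;
  return true;
}
```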
Sanjay Patel 763b31237f [InstCombine] move comments closer to relevant code; NFC 2022-08-13 09:16:33 -04:00
Liqin.Weng 8a12606a7e [AVR] Remove debug location of spill/reload instructions
Reviewed By: MatzeB, benshi001

Differential Revision: https://reviews.llvm.org/D129262
2022-08-13 20:58:12 +08:00
LiaoChunyu 99ef0ddea3 [RISCV] Fold (sub constant, (setcc x, y, eq/neq)) -> (add constant - 1, (setcc x, y, neq/eq))
(setcc x, y, eq/neq) are seqz, snez that set rd = 0/1.

addi handles the immediate directly, which can save the instructions otherwise needed to load an immediate.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D131471
2022-08-13 20:37:57 +08:00
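The rewrite in 99ef0ddea3 relies on a setcc producing 0 or 1, so subtracting the "eq" result equals adding the "ne" result to constant - 1. A small check over a handful of values, with a hypothetical function name and plain `int` arithmetic standing in for the target's registers:

```cpp
// Checks c - (x == y) == (c - 1) + (x != y) for a small range of inputs.
bool subSetccMatchesAddInvertedSetcc() {
  for (int c = -8; c <= 8; ++c)
    for (int x = 0; x < 4; ++x)
      for (int y = 0; y < 4; ++y) {
        int original = c - (x == y ? 1 : 0);
        int rewritten = (c - 1) + (x != y ? 1 : 0);
        if (original != rewritten)
          return false;
      }
  return true;
}
```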
Fangrui Song e9b213131a [Support] computeHostNumPhysicalCores: use sched_getaffinity for all non-Android Linux with no custom implementation
Make the sched_getaffinity based implementation available to all architectures
(except s390x/x86 which have a custom implementation). The `CPU_ALLOC(2048)`
code supports all `CONFIG_NR_CPUS` values in Linux kernel `arch/*/configs/`.

The function is mainly used by in-process ThinLTO to decide the default number
of threads. Returning -1 will use just one thread.

Android is excluded because of the higher API level requirement:
`sched_getaffinity; # introduced-arm=12 introduced-arm64=21 introduced-x86=12 introduced-x86_64=21`
2022-08-13 01:36:13 -07:00
Anubhab Ghosh a31af32183 Reapply [Orc] Properly deallocate mapped memory in MapperJITLinkMemoryManager
When memory is deallocated from MapperJITLinkMemoryManager deinitialize
actions are run through mapper and in case of InProcessMapper, memory
protections of the region are reset to read/write as they were previously
changed and can be reused in future.

Differential Revision: https://reviews.llvm.org/D131768
2022-08-13 13:07:50 +05:30
Luo, Yuanke 30f9e6ebd3 (Reland) [fastalloc] Support allocating specific register class in fastalloc
Reland commit 719658d078

The base RA supports infrastructure that only allows a specific register
class to be allocated in an RA pass. Since greedy RA and basic RA derive
from base RA, they both allow allocating a specific register class. Fast
RA doesn't support allocating registers for a specific register class.
This patch enables ShouldAllocateClass in fast RA so that it can
allocate registers for a specific register class.

Differential Revision: https://reviews.llvm.org/D131825
2022-08-13 13:57:34 +08:00
Craig Topper 37db283362 [RISCV] isImpliedByDomCondition returns an Optional<bool> not a bool.
We were incorrectly checking that it returned an implication result,
not that the implication result itself was true.
2022-08-12 22:21:05 -07:00
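The bug class fixed in 37db283362 is easy to reproduce with `std::optional<bool>`, used here as a stand-in for LLVM's `Optional<bool>`. The query function below is hypothetical and only mimics the three-way result (true, false, unknown); it is not isImpliedByDomCondition.

```cpp
#include <optional>

// Hypothetical three-way query: nullopt means "no implication known".
static std::optional<bool> impliedByCondition(int x) {
  if (x > 0)
    return true;
  if (x < 0)
    return false;
  return std::nullopt;
}

// Buggy: the condition only tests whether a result is present.
bool buggyCheck(int x) {
  if (impliedByCondition(x))
    return true;
  return false;
}

// Fixed: a result must be present and must itself be true.
bool fixedCheck(int x) {
  std::optional<bool> result = impliedByCondition(x);
  return result.has_value() && *result;
}
```

For a negative input the buggy version reports true even though the implication result is false.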
Sunho Kim 50f305017d [ORC] Silence copy elision warning. 2022-08-13 14:17:43 +09:00
Sunho Kim 7332b18fa7 [ORC] Specify the typename. 2022-08-13 13:58:50 +09:00
Anubhab Ghosh 8180105143 Revert "[Orc] Properly deallocate mapped memory in MapperJITLinkMemoryManager"
This reverts commit 143555b2ed.
2022-08-13 10:22:31 +05:30
Sunho Kim 9189a26664 [ORC_RT][COFF] Initial platform support for COFF/x86_64.
Initial platform support for COFF/x86_64.

Completed features:
* Statically linked orc runtime.
* Full linking/initialization of static/dynamic vc runtimes and microsoft stl libraries.
* SEH exception handling.
* Full static initializers support
* dlfns
* JIT side symbol lookup/dispatch

Things to note:
* It uses vc runtime libraries found in vc toolchain installations.
* Bootstrapping state is separated because when statically linking orc runtime it needs microsoft stl functions to initialize the orc runtime, but static initializers need to be run in order to fully initialize stl libraries.
* Process symbols can't be used blindly on msvc platform; otherwise duplicate definition error gets generated. If process symbols are used, it's destined to get out-of-reach error at some point.
* Atexit currently not handled -- will be handled in the follow-up patches.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D130479
2022-08-13 13:48:40 +09:00
Anubhab Ghosh 143555b2ed [Orc] Properly deallocate mapped memory in MapperJITLinkMemoryManager
When memory is deallocated from MapperJITLinkMemoryManager deinitialize
actions are run through mapper and in case of InProcessMapper, memory
protections of the region are reset to read/write as they were previously
changed and can be reused in future.

Differential Revision: https://reviews.llvm.org/D131768
2022-08-13 10:08:25 +05:30
Joe Loser b12aa497cd
[DAGCombine] Replace std::monostate equivalent in DAGCombiner.cpp
Remove the `UnitT` type and operators in favor of using `std::monostate`
directly.

Differential Revision: https://reviews.llvm.org/D131778
2022-08-12 21:42:09 -06:00
jacquesguan 0fe5f03eeb [RISCV][NFC] Use nested namespace definitions.
Since we use C++17 now, we could use nested namespace definitions to simplify code.

Differential Revision: https://reviews.llvm.org/D131751
2022-08-13 09:56:59 +08:00
Fangrui Song 3329cec2f7 [DebugInfo] Don't join DW_AT_comp_dir and directories[0] for DWARF v5 line tables
DWARF v5 6.2.4 The Line Number Program Header says:

> The first entry is the current directory of the compilation. Each additional
> path entry is either a full path name or is relative to the current directory of
> the compilation.

When forming a path, relative DW_AT_comp_dir and directories[0] are not supposed
to be joined together. Fix getFileNameByIndex to special case DWARF v5 DirIdx == 0.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D131804
2022-08-12 14:01:52 -07:00
James Y Knight 4d7f9b7489 X86: Don't fold TEST into ADD ...@GOTTPOFF/GOTNTPOFF/INDNTPOFF
The linker may convert such an ADD into a LEA, so we must not
use the EFLAGS output.

This causes miscompiles with -fsanitize=null after
bacdf80f42 added
llvm.threadlocal.address -- previously, global variables were known to
be non-null, but the intrinsic is not currently known to return
nonnull. (That should be corrected, but it shouldn't've caused
miscompiles!)

Differential Revision: https://reviews.llvm.org/D131716
2022-08-12 20:52:00 +00:00
Fangrui Song f62e60fb23 [MCDwarf] Respect -fdebug-prefix-map= for generated assembly debug info (DWARF v5)
For generated assembly debug info, MCDwarfLineTableHeader::CompilationDir is an
unmapped path set in MCContext::setGenDwarfRootFile. Remap it.

A relative destination path of -fdebug-prefix-map= exposes a llvm-dwarfdump bug
which joins relative DW_AT_comp_dir and directories[0].

Fix https://github.com/llvm/llvm-project/issues/56609

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D131749
2022-08-12 12:52:36 -07:00
Ilia Diachkov df8713079b [SPIRV] support capabilities and extensions
This patch supports SPIR-V capabilities and extensions. In addition,
it inserts decorations related to MIFlags and improves support of switches.
Five tests are included to demonstrate the improvement.

Differential Revision: https://reviews.llvm.org/D131221

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-08-12 23:33:15 +03:00
Kevin Athey 532564de17 [MSAN] add flag to suppress storage of stack variable names with -sanitize-memory-track-origins
Allows for even more savings in the binary image while simultaneously removing the name of the offending stack variable.

Depends on D131631

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D131728
2022-08-12 11:59:53 -07:00
Ben Langmuir 79f34ae7fe [llvm] Fix assertion when stat fails in remove_directories
We were dereferencing an empty Optional if IgnoreErrors was true and the
stat failed.

rdar://60887887

Differential Revision: https://reviews.llvm.org/D131791
2022-08-12 11:32:04 -07:00
Wolfgang Pieb 7ddfb4dfeb [Inlining] Introduce the function attribute "inline-max-stacksize"
The value of the attribute is a size in bytes. It has the effect of
suppressing inlining of functions whose stacksizes exceed the given value.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D129904
2022-08-12 11:07:18 -07:00
Arthur Eubanks a3ac1cfaed [SampleProfile] Fix non-determinism in promoteMergeNotInlinedContextSamples()
We're seeing non-determinism with loading sample profiles. It seems to
be related to the order in which we merge FunctionSamples in
promoteMergeNotInlinedContextSamples(). Use a MapVector to iterate over
NonInlinedCallSites in the order entries were inserted.

Reviewed By: wenlei, davidxl

Differential Revision: https://reviews.llvm.org/D131592
2022-08-12 10:13:25 -07:00
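The fix in a3ac1cfaed depends on iterating in insertion order. The sketch below shows the general idea with standard containers only: a vector keeps entries in the order they were first inserted and a hash map indexes into it. The class is a hypothetical simplification and not `llvm::MapVector` itself.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map from names to counts.
class OrderedCounts {
public:
  // Returns the count for `key`, inserting a zero entry on first use.
  int &operator[](const std::string &key) {
    auto [it, inserted] = index.try_emplace(key, entries.size());
    if (inserted)
      entries.emplace_back(key, 0);
    return entries[it->second].second;
  }

  // Entries in the order their keys were first inserted.
  const std::vector<std::pair<std::string, int>> &inOrder() const {
    return entries;
  }

private:
  std::unordered_map<std::string, std::size_t> index;
  std::vector<std::pair<std::string, int>> entries;
};

// Iteration order follows first insertion, independent of hashing.
std::vector<std::string> keysInInsertionOrder() {
  OrderedCounts counts;
  counts["foo"] += 1;
  counts["bar"] += 2;
  counts["foo"] += 3;
  counts["baz"] += 1;
  std::vector<std::string> keys;
  for (const auto &entry : counts.inOrder())
    keys.push_back(entry.first);
  return keys;
}
```

Iterating the hash map directly would give an order that depends on the hash function and bucket layout, which is the kind of non-determinism the commit removes.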
James Y Knight 20451cb06b Update license on Unicode.org's ConvertUTF code.
The code was relicensed by its owner (Unicode.org) a long time back,
but we still had the old (problematic) license in our fork.

Note that the source files have not been distributed from unicode.org
since 2009 (due to being buggy and unmaintained upstream), but they
were given this license before that.

Fixes https://github.com/llvm/llvm-project/issues/32309

Differential Revision: https://reviews.llvm.org/D66390
2022-08-12 16:51:08 +00:00
Kevin Athey ec277b67eb [MSAN] Separate id ptr from constant string for variable names used in track origins.
The goal is to reduce the size of the MSAN with track origins binary, by making
the variable name locations constant which will allow the linker to compress
them.

Follows: https://reviews.llvm.org/D131415

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D131631
2022-08-12 08:47:36 -07:00
James Y Knight 59351fe340 SPIRV: Fix compilation in NDEBUG. 2022-08-12 14:00:39 +00:00
Simon Pilgrim 4de35f4bbf [DAG] Add TODO to remove creation of INSERT_SUBVECTOR nodes from SimplifyMultipleUseDemandedBits
SimplifyMultipleUseDemandedBits shouldn't be creating general nodes like this - although we allow bitcasts, even general constant folding is avoided.

Removing it causes a number of regressions that need addressing first, but I've added a TODO for now.
2022-08-12 10:45:30 +01:00
Filipp Zhinkin 1626ee6a95 [DAGCombine] Hoist shifts out of a logic operations tree.
Hoist and combine shift operations from logic operations tree:
logic (logic (SH x0, s), y), (logic (SH x1, s), z)  --> logic (SH (logic x0, x1), s), (logic y, z)

The transformation improves code generated for some cases related to the issue https://github.com/llvm/llvm-project/issues/49541.

Correctness:
https://alive2.llvm.org/ce/z/pVqVgY
https://alive2.llvm.org/ce/z/YVvT-q
https://alive2.llvm.org/ce/z/W5zTBq
https://alive2.llvm.org/ce/z/YfJsvJ
https://alive2.llvm.org/ce/z/3YSyDM
https://alive2.llvm.org/ce/z/Bs2kzk
https://alive2.llvm.org/ce/z/EoQpzU
https://alive2.llvm.org/ce/z/Jnc_5H
https://alive2.llvm.org/ce/z/_LP6k_
https://alive2.llvm.org/ce/z/KvZNC9

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D131189
2022-08-12 12:42:16 +03:00
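The transformation in 1626ee6a95 can be checked by brute force on small values. The sketch assumes that every `logic` in the pattern is the same opcode (and, or, or xor) and that `SH` is a left or logical right shift by the same amount; those assumptions, the 4-bit width, and the helper names are choices made here for illustration. The Alive2 links in the commit message are the real correctness evidence.

```cpp
enum class Logic { And, Or, Xor };

static unsigned applyLogic(Logic op, unsigned a, unsigned b) {
  switch (op) {
  case Logic::And:
    return a & b;
  case Logic::Or:
    return a | b;
  case Logic::Xor:
    return a ^ b;
  }
  return 0;
}

// 4-bit shift: left shifts are masked back to 4 bits, right shifts are logical.
static unsigned applyShift(bool left, unsigned x, unsigned s) {
  return left ? ((x << s) & 0xFu) : ((x & 0xFu) >> s);
}

// Checks logic(logic(SH x0, s), y), (logic(SH x1, s), z))
//     == logic(SH(logic x0, x1), s), (logic y, z)) over all 4-bit inputs.
bool hoistingShiftsPreservesValue() {
  const Logic ops[] = {Logic::And, Logic::Or, Logic::Xor};
  for (Logic op : ops)
    for (int left = 0; left < 2; ++left)
      for (unsigned s = 0; s < 4; ++s)
        for (unsigned x0 = 0; x0 < 16; ++x0)
          for (unsigned x1 = 0; x1 < 16; ++x1)
            for (unsigned y = 0; y < 16; ++y)
              for (unsigned z = 0; z < 16; ++z) {
                unsigned before = applyLogic(
                    op, applyLogic(op, applyShift(left != 0, x0, s), y),
                    applyLogic(op, applyShift(left != 0, x1, s), z));
                unsigned after = applyLogic(
                    op, applyShift(left != 0, applyLogic(op, x0, x1), s),
                    applyLogic(op, y, z));
                if (before != after)
                  return false;
              }
  return true;
}
```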
Max Kazantsev a3d1fb3b59 [SCEV] Prove condition invariance via context
Contextual knowledge may be used to prove invariance of some conditions.
For example, in this case:
```
  ; %len >= 0
  guard(%iv = {start,+,1}<nuw> <s %len)
  guard(%iv = {start,+,1}<nuw> <u %len)
```
the 2nd check always fails if `start` is negative and always passes otherwise.

It looks like there are more opportunities of this kind that are still to be
implemented in the future.

Differential Revision: https://reviews.llvm.org/D129753
Reviewed By: apilipenko
2022-08-12 14:23:35 +07:00
wanglian 061f7ec9fa [LegalizeTypes][NFC] Use getConstantOperandVal instead of cast constant getvalue
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D131642
2022-08-12 14:35:10 +08:00
wanglian 1303057888 [LegalizeTypes][NFC] Use dyn_cast instead of isa and cast
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D131544
2022-08-12 14:18:49 +08:00
gonglingqin 9e09c3186e [LoongArch] Add codegen support for ISD::CTPOP, ISD::CTTZ and ISD::CTLZ
Differential Revision: https://reviews.llvm.org/D131550
2022-08-12 14:15:30 +08:00
Ting Wang 12e1936f64 [PowerPC] Add XXEVAL TD pattern
Add xxeval TD pattern for P10 on: eqv, nor, or, xor.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D131654
2022-08-12 01:27:24 -04:00
Fangrui Song b0c4cd35df [MCDwarf] Use emplace to avoid move assignment. NFC 2022-08-12 05:05:49 +00:00
Chuanqi Xu e190b7cc90 [Coroutines] Maintain the position of final suspend
Closing https://github.com/llvm/llvm-project/issues/56329

The problem happens when we try to simplify the suspend points. We might
break the assumption that the final suspend lives in the last slot of
Shape.CoroSuspends. This patch tries to maintain the assumption and fixes
the problem.
2022-08-12 13:05:08 +08:00
Chen Zheng 8d19cfb72e [PowerPC] omit location attribute for TLS variable on AIX
TLS debug on AIX is not ready for now.
The location generated in no-integrated-as mode is wrong and
in integrated-as mode causes AIX linker error.

Reviewed By: Esme

Differential Revision: https://reviews.llvm.org/D130245
2022-08-12 00:54:48 -04:00
Weining Lu 40f1f9b357 [LoongArch] Return null SDValue by default in LowerOperation. NFC
Differential Revision: https://reviews.llvm.org/D131546
2022-08-12 12:09:08 +08:00
wanglian 3b71f1d5ab [LegalizeTypes][NFC] Use getConstantOperandAPInt instead of cast constant getAPInt
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D131653
2022-08-12 10:21:54 +08:00
Mircea Trofin 3486b1b736 [mlgo][nfc] regalloc test model generator: prep for TFLite
Casting operator to make TFLite happy.

Reviewed By: yundiqian

Differential Revision: https://reviews.llvm.org/D131584
2022-08-11 15:53:23 -07:00
Craig Topper e493944f5f [RISCV] Use SLTIU X, -1 for (setne X, -1).
Since -1 is the maximum unsigned value, all values less than it
are not equal to it.
2022-08-11 15:36:04 -07:00
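The reasoning in e493944f5f translates directly into C++: reinterpreted as unsigned, -1 is the largest value, so "unsigned less than -1" is the same as "not equal to -1". A minimal sketch with a hypothetical function name, using 64-bit integers to mirror rv64 registers:

```cpp
#include <cstdint>

// Computes (x != -1) the way SLTIU X, -1 would: an unsigned compare against
// the all-ones value.
bool setneMinusOneViaSltiu(std::int64_t x) {
  return static_cast<std::uint64_t>(x) < static_cast<std::uint64_t>(-1);
}
```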
Martin Storsjö 2c2fb0c737 [llvm] Use hidden visibility when building for MinGW with Clang
Since c5b3de6745 (git main,
August 11th), Clang does generate working hidden visibility
on MinGW targets. Using that reduces the number of exports from
a dylib build of LLVM significantly, which is vital for fitting
within the limit of 64k exported symbols from a DLL.

It's essential that if we set CMAKE_CXX_VISIBILITY_PRESET=hidden
(which passes -fvisibility=hidden on the command line), we also
must define LLVM_EXTERNAL_VISIBILITY consistently to override
it. (If there are mismatches, e.g. setting hidden visibility generally
but never overriding it back to default for the symbols that do need
to be exported, we'd get broken builds in such configurations.)

We don't want to be using __attribute__((visibility("hidden"))) on
MinGW with GCC, because GCC produces a warning about it. (GCC hasn't
warned about the command line options that set hidden visibility
though.) Clang has historically not warned about either of them, so
it is harmless to use the hidden visibility when building with older
Clang (so we don't need to detect the exact version of Clang/LLVM where
it has an effect).

This reduces the number of exported symbols for a dylib build of LLVM;
previously libLLVM exported around 64650 symbols (when the maximum is
65536) when the ARM, AArch64 and X86 targets were enabled. If enabling
more targets (or if building with e.g. assertions enabled), it would
exceed the limit. Now with visibility flags in use, the same build
with ARM, AArch64 and X86 ends up at around 35k exported symbols.

Differential Revision: https://reviews.llvm.org/D131661
2022-08-12 00:57:05 +03:00
Craig Topper 2c79801a0e [RISCV] Add more ineg+setcc isel patterns to avoid creating neg+xori+slti(u).
Including patterns to select addiw if only the lower 32 bits are used.

I'm not excited about adding this many patterns. I'm looking at whether
we can create the xori during lowering and move the ineg patterns to
DAGCombiner.
2022-08-11 14:24:09 -07:00
Fangrui Song 57f334d817 [Support] Remove Log2 workaround for Android API level < 18
The function added by D9467 is unneeded.
https://github.com/android/ndk/wiki/Changelog-r24 shows that the NDK has
moved forward to at least a minimum target API of 19.

Reviewed By: srhines

Differential Revision: https://reviews.llvm.org/D131656
2022-08-11 17:39:41 +00:00
Simon Pilgrim 6ba5fc2dee [X86] lowerShuffleWithVPMOV - support direct lowering to VPMOV on VLX targets
lowerShuffleWithVPMOV currently only matches shuffle(truncate(x)) patterns, but on VLX targets the truncate isn't usually necessary to make the VPMOV node worthwhile (as we're only targeting v16i8/v8i16 shuffles we're almost always ending up with a PSHUFB node instead). PACKSS/PACKUS are still preferred vs VPMOV due to their lower uop count.

Fixes the remaining regression from the fixes in rG293899c64b75
2022-08-11 17:40:07 +01:00
Kevin P. Neal de64d0076e [FPEnv][InstSimplify] Fix formatting error.
My most recent change for D131607 had a formatting error that I didn't
notice until after I committed it. Let me fix it now so changes to this
file will be back-to-back from me.
2022-08-11 12:10:05 -04:00
Sanjay Patel fa68d93d54 [InstCombine] fold reassociative fadd with negated operand
We manage to iteratively achieve this result with no extra
uses, and the reassociate pass can also do this, but this
pattern falls through the cracks in the example from
issue #57053.
2022-08-11 11:43:36 -04:00
Kevin P. Neal 7bdb010d7c [FPEnv][InstSimplify] 0.0 - -X ==> X
Another ticket split out of D107285, this extends the optimization
of 0.0 - -X to just X when using constrained intrinsics and the
optimization is allowed.

If the negation of X is done with fsub then the match fails because of
the lack of IR Matcher support for constrained intrinsics.

While I'm here, remove some TODO notices since the work is no longer
planned.

Differential Revision: https://reviews.llvm.org/D131607
2022-08-11 11:35:33 -04:00
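One reason a fold like the one in 7bdb010d7c is only valid "when the optimization is allowed" is signed zero. The sketch below is an illustration under assumptions: IEEE 754 doubles, the default round-to-nearest mode, and no fast-math flags. It does not reproduce the exact conditions checked by the patch, and the function names are hypothetical.

```cpp
#include <cmath>

// Computes 0.0 - (-x) at run time; volatile keeps the compiler from folding.
double zeroMinusNegated(double x) {
  volatile double negated = -x;
  volatile double zero = 0.0;
  return zero - negated;
}

// True if replacing 0.0 - (-x) with x would give the same value and sign.
bool foldIsExactFor(double x) {
  double result = zeroMinusNegated(x);
  return result == x && std::signbit(result) == std::signbit(x);
}
```

For x = -0.0 the expression evaluates to +0.0, so replacing it with x changes the sign of zero.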
Simon Pilgrim 08a880509e [X86] Add RDPRU instruction CPUID bit masks
As mentioned on D128934 - we weren't including the CPUID bit handling for the RDPRU instruction

AMD's APMv3 (24594) lists it as CPUID Fn8000_0008_EBX Bit#4
2022-08-11 16:07:36 +01:00
Peter Waller 898699831b [DAGCombine] Check zext legality in zext-extract-extend combine
Discussed in D131503.

Fix to D130782.
2022-08-11 14:30:42 +00:00
Eric Astor 94fae7a581 [ms] [llvm-ml] Add support for nested PROC/ENDP pairs
This is believed to match behavior by ML.EXE and ML64.EXE.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D131522
2022-08-11 14:19:02 +00:00
Anubhab Ghosh 0aaa74f7e6 [Orc] Reorder operations in ExecutorSharedMemoryMapperService shutdown
Differential Revision: https://reviews.llvm.org/D131510
2022-08-11 19:34:10 +05:30
David Green a9e9dd9a3a [AArch64] Add bf16 select handling
A bfloat select operation will currently crash, but is allowed from C.
This adds handling for the operation, turning it into a FCSELHrrr if
fullfp16 is present, or converting it to a FCSELSrrr if not. The
FCSELSrrr is created via using INSERT_SUBREG/EXTRACT_SUBREG to convert
the bf16 to a f32 and using the f32 pattern for FCSELSrrr. (I originally
attempted to do this via a tablegen pattern, but it appears that the
nzcv glue is placed onto the wrong node, causing it to be forgotten and
incorrect scheduling to be emitted).

The FCSELSrrr can also be used for fp16 selects when +fullfp16 is not
present, which helps avoid an unnecessary promotion to f32.

Differential Revision: https://reviews.llvm.org/D131253
2022-08-11 14:20:36 +01:00
Simon Pilgrim 5dcf0c342b [X86] lowerShuffleWithVPMOV - remove oneuse constraints on shuffle(trunc(x),undef) -> vpmov(x) lowering
These were added in rG057bdd63 but shuffle combining has gotten a lot better at folding different vector widths since then.
2022-08-11 14:06:42 +01:00
David Stuttard 1d1cc05539 AMDGPU: mbcnt allow for non-zero src1 for known-bits
Src1 for mbcnt can be a non-zero literal or register. Take this into account
when calculating known bits.

Differential Revision: https://reviews.llvm.org/D131478
2022-08-11 13:23:43 +01:00
Yeting Kuo 875694089d [RISCV] Peephole optimization to fold merge.vvm and unmasked intrinsics.
The patch uses peephole method to fold merge.vvm and unmasked intrinsics to
masked intrinsics. Using peephole instead of tablegen patterns is to avoid large
auto generated code.

Note: The patch ignores segment loads since I don't know how to test them.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D130442
2022-08-11 17:58:11 +08:00
Ilya Biryukov 7c80c4d677 [MC] NFC. Avoid redundant copies when constructing StructFieldInfo
Follow-up after D131595, see comments in the review thread.

The intention of having two constructors was to minimize the copies of
`vector`, but a lack of `std::move` on the call site caused the wrong
constructor to be called.

Switched to a single constructor that accepts a value.
Accepting by value allows a single constructor while still letting the
call site decide whether to copy or move.
2022-08-11 11:53:24 +02:00
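The by-value constructor idiom described in 7c80c4d677 can be sketched as follows. `FieldList` is a hypothetical stand-in and not the real StructFieldInfo; the point is only that one constructor serves both callers, and the call site picks copy or move.

```cpp
#include <utility>
#include <vector>

// Hypothetical type with a single sink constructor taking its vector by value.
struct FieldList {
  std::vector<int> values;
  explicit FieldList(std::vector<int> v) : values(std::move(v)) {}
};

// Moving at the call site hands the existing buffer over without copying.
bool moveAtCallSiteReusesBuffer() {
  std::vector<int> source{1, 2, 3};
  const int *buffer = source.data();
  FieldList list(std::move(source));
  return list.values.data() == buffer;
}

// Passing an lvalue copies it; the source stays intact.
bool copyAtCallSiteKeepsSource() {
  std::vector<int> source{1, 2, 3};
  FieldList list(source);
  return source.size() == 3 && list.values.data() != source.data();
}
```

Forgetting `std::move` at the call site silently selects the copy, which matches the problem the commit message describes.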