llvm-project

Commit Graph

Author	SHA1	Message	Date
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit `43bc584dc0`. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Daniil Fukalov	3489c2d7b1	[TTI] NFC: Change getTypeLegalizationCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen, kparzysz Differential Revision: https://reviews.llvm.org/D101533	2021-04-30 22:51:51 +03:00
Guozhi Wei	b817ea7b17	[MachineFunction] Make comment for TracksLiveness more clearer As discussed in https://lists.llvm.org/pipermail/llvm-dev/2021-April/150225.html, the current comments for TracksLiveness property and isKill flag are confusing. This patch makes the comments more clearer. Differential Revision: https://reviews.llvm.org/D101500	2021-04-30 12:10:36 -07:00
Scott Linder	f3026d8b8d	[ADT] Add llvm::remove_cvref and llvm::remove_cvref_t Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100669	2021-04-30 18:22:38 +00:00
Scott Linder	c6f20d70a8	[ADT] Add STLForwardCompat.h and llvm::disjunction Move some types in STLExtras.h which are named and behave identically to STL types from future standards into a dedicated header. This keeps them organized (they are not "extras" in the same sense as most types in STLExtras.h are) and fixes circular dependencies in future patches. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100668	2021-04-30 17:28:47 +00:00
Paul C. Anagnostopoulos	985ab6e1fa	[TableGen] Fix two bugs in 'defm' when complex 'assert' is involved. This patch fixes two bugs that arise when a 'defm' inherits from a multiclass and also from a class with assertions. Differential Revision: https://reviews.llvm.org/D101626	2021-04-30 11:31:06 -04:00
Simon Moll	43bc584dc0	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Dominik Montada	97ed1b6036	[GISel] Teach TableGen to check predicates of immediate operands in patterns Reviewed By: dsanders Differential Revision: https://reviews.llvm.org/D91703	2021-04-30 10:18:45 +02:00
Matt Arsenault	1cf3d68f97	VirtRegMap: Add pass option to not clear virt regs In a future change it will be possible to run register allocation with a specific set of register classes, so some of the remaining virtual registers will still be meaningful.	2021-04-29 21:08:47 -04:00
Zequan Wu	cab48e2f0e	[CodeGen] don't emit addrsig symbol if it's used only by metadata Value only used by metadata can be removed from .addrsig table. This solves the undefined symbol error when enabling addrsig table on COFF LTO. Differential Revision: https://reviews.llvm.org/D101512	2021-04-29 15:39:30 -07:00
Amara Emerson	96ec6d91e4	[AArch64][GlobalISel] Simplify out of range rotate amount. Differential Revision: https://reviews.llvm.org/D101005	2021-04-29 14:05:58 -07:00
Alexey Bataev	12c51f2358	[COST] Improve shuffle kind detection if shuffle mask is provided. Added an extra analysis for better choosing of shuffle kind in getShuffleCost functions for better cost estimation if mask was provided. Differential Revision: https://reviews.llvm.org/D100865	2021-04-29 12:48:00 -07:00
Alexey Bataev	6e859f3cd4	Revert "[COST] Improve shuffle kind detection if shuffle mask is provided." This reverts commit `9239932221` to fix a compiler crash on mask checks.	2021-04-29 12:40:33 -07:00
Victor Huang	ae3377c553	[AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects - Add new variantKinds for the symbol's variable offset and region handle - Print the proper relocation specifier @gd in the asm streamer when emitting the TC Entry for the variable offset for the symbol - Fix the switch section failure between the TC Entry of variable offset and region handle - Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property Reviewed by: sfertile Differential Revision: https://reviews.llvm.org/D100956	2021-04-29 13:18:59 -05:00
Benjamin Kramer	df323ba445	Revert "[X86] Support AMX fast register allocation" This reverts commit `3b8ec86fd5`. Revert "[X86] Refine AMX fast register allocation" This reverts commit `c3f95e9197`. This pass breaks using LLVM in a multi-threaded environment by introducing global state.	2021-04-29 18:56:33 +02:00
Alexey Bataev	9239932221	[COST] Improve shuffle kind detection if shuffle mask is provided. Added an extra analysis for better choosing of shuffle kind in getShuffleCost functions for better cost estimation if mask was provided. Differential Revision: https://reviews.llvm.org/D100865	2021-04-29 09:42:56 -07:00
Sanjay Patel	f4b1272d3d	[ADT] fix typo in code block comment; NFC	2021-04-29 12:09:22 -04:00
Anirudh Prasad	ded0a70aeb	[AsmParser][SystemZ][z/OS] Reject "Dot" as current PC on z/OS - Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter) - However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch) - The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not. - It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`) Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100975	2021-04-29 11:58:54 -04:00
Sander de Smalen	51d648c119	Revert "[LV] Calculate max feasible scalable VF." Temporarily reverting this patch due to some unexpected issue found by one of the PPC buildbots. This reverts commit `584e9b6e4b`.	2021-04-29 16:04:37 +01:00
Chirag Khandelwal	fbd3548d1c	[LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder This patch adds section support in the OpenMP IRBuilder module, along with a test for the same. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D89671	2021-04-29 18:39:49 +05:30
Amara Emerson	d138e97c2a	[GlobalISel] Bump CallLoweringInfo::OrigArgs initial size to 32. NFC. We spend some time during sqlite3 compilation regrowing this vector, bump it up to avoid this. Gives around 1-2% improvement in codegen-only time for sqlite3 at -O0.	2021-04-29 01:01:29 -07:00
Evgeny Leviant	6a0283d0d2	[NewPM] Add an option to dump pass structure Patch adds -debug-pass-structure option to dump pass structure when new pass manager is used. Differential revision: https://reviews.llvm.org/D99599	2021-04-29 10:29:42 +03:00
Bardia Mahjour	ddb3b26a12	[LV] Consider Loop Unroll Hints When Making Interleave Decisions This patch causes the loop vectorizer to not interleave loops that have nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)). Note that if a particular interleave count is being requested (through llvm.loop.interleave_count), it will still be honoured, regardless of the presence of nounroll hints. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101374	2021-04-28 17:27:52 -04:00
Anirudh Prasad	07b0a72d8e	[AsmParser][SystemZ][z/OS] Use updated framework in AsmLexer to accept special tokens as Identifiers - Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers. - These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ. - In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_\|$\|@\|#] are already supported as "part" of an Identifier. - The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100959	2021-04-28 15:43:24 -04:00
Jonas Devlieghere	625bd94c6d	[dsymutil] Add flag to force a static variable to keep its enclosing function Add a flag to change dsymutil's behavior and force a static variable to keep its enclosing function. The test shows a situation where that could be useful. I'm not convinced this behavior makes sense as a default, which is why it's behind a flag. rdar://74918374 Differential revision: https://reviews.llvm.org/D101337	2021-04-28 11:33:04 -07:00
David Candler	b8baa2a913	[ARM][AArch64] Require appropriate features for crypto algorithms This patch changes the AArch32 crypto instructions (sha2 and aes) to require the specific sha2 or aes features. These features have already been implemented and can be controlled through the command line, but do not have the expected result (i.e. `+noaes` will not disable aes instructions). The crypto feature retains its existing meaning of both sha2 and aes. Several small changes are included due to the knock-on effect this has: - The AArch32 driver has been modified to ensure sha2/aes is correctly set based on arch/cpu/fpu selection and feature ordering. - Crypto extensions are permitted for AArch32 v8-R profile, but not enabled by default. - ACLE feature macros have been updated with the fine grained crypto algorithms. These are also used by AArch64. - Various tests updated due to the change in feature lists and macros. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D99079	2021-04-28 16:26:18 +01:00
Paul C. Anagnostopoulos	952c6ddd8b	[TableGen] Add the !find bang operator !find searches a source string for a target string and returns the position. Differential Revision: https://reviews.llvm.org/D101318	2021-04-28 09:51:00 -04:00
Matt Arsenault	cea97fc0fc	GlobalISel: Relax verification of physical register copy types This was picking a concrete size for a physical register, and enforcing exact match on the virtual register's type size. Some targets add multiple types to a register class, and some are smaller than the full bit width. For example x86 adds f32 to 128-bit xmm registers, and AMDGPU adds i16/f16 to 32-bit registers. It might be better to represent these cases as a copy of the full register and an extraction of the subpart, but a lot of code assumes you can directly copy. This will help fix the current usage of the DAG calling convention infrastructure which is incompatible with how GlobalISel is now using it. The API is somewhat cumbersome here, but I just mirrored the existing functions, except now with LLTs (and allow returning null on failure, unlike the MVT version). I think the concept of selecting register classes based on type is flawed to begin with, but I'm trying to keep this compatible with the existing handling.	2021-04-28 08:45:41 -04:00
Sander de Smalen	584e9b6e4b	[LV] Calculate max feasible scalable VF. This patch also refactors the way the feasible max VF is calculated, although this is NFC for fixed-width vectors. After this change scalable VF hints are no longer truncated/clamped to a shorter scalable VF, nor does it drop the 'scalable flag' from the suggested VF to vectorize with a similar VF that is fixed. Instead, the hint is ignored which means the vectorizer is free to find a more suitable VF, using the CostModel to determine the best possible VF. Reviewed By: c-rhodes, fhahn Differential Revision: https://reviews.llvm.org/D98509	2021-04-28 12:30:00 +01:00
Benjamin Kramer	7e5682ee62	[ADT] Make TrackingStatistic's ctor constexpr This lets clang diagnose unused statistics, so remove them.	2021-04-28 12:00:17 +02:00
RamNalamothu	63cfab4f40	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-28 09:04:04 +05:30
Craig Topper	3067520bf4	[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT Previously we used an i32 constant to store the saturation width, but i32 isn't legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the type legalizer. This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes it opaque to the type legalizer. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101262	2021-04-27 14:38:42 -07:00
Roman Lebedev	15f631cc78	[NFC][IR] PHINode: ... and assert in another ctor too	2021-04-27 20:52:44 +03:00
Roman Lebedev	1ebbf84ba4	[NFC][IR] PHINode: assert we aren't trying to create token-typed PHI Verifier will complain, but by then it may be too late, because we might have never reached it because we already crashed with some bogus bug. It is best to catch this the moment it happens.	2021-04-27 20:49:42 +03:00
Nick Desaulniers	ea8416bf4d	[CodeGenOptions] make StackProtectorGuardOffset signed GCC supports negative values for -mstack-protector-guard-offset=, this should be a signed value. Pre-req to D100919. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101325	2021-04-27 10:12:58 -07:00
Victor Huang	241c2da406	[AIX][Power10] Restrict prefixed instructions from crossing the 64byte boundary This patch adds the support to restrict prefixed instruction from crossing the 64 byte boundary: - Add the infrastructure to register a custom XCOFF streamer - Add a custom XCOFF streamer for PowerPC to allow us to intercept instructions as they are being emitted and align all 8 byte instructions to a 64 byte boundary if required by adding a 4 byte nop. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D101107	2021-04-27 11:55:18 -05:00
Nico Weber	21da04f701	[llvm, clang] Remove stdlib includes from .h files without `std::` Found files not containing `std::` with: INCL="algorithm\|array\|list\|map\|memory\|queue\|set\|string\|utility\|vector\|unordered_map\|unordered_set" git ls-files llvm/include/llvm \| grep '\.h$' \| xargs grep -L std:: \| \ xargs grep -El "#include <($INCL)>$" > to_process.txt git ls-files clang/include/clang \| grep '\.h$' \| xargs grep -L std:: \| \ xargs grep -El "#include <($INCL)>$" >> to_process.txt Then removed these headers from those files with INCL_ESCAPED="$(echo $INCL\|sed 's/\|/\\\|/g')" cat to_process.txt \| xargs sed -i "/^#include <$$INCL_ESCAPED$>$/d" cat to_process.txt \| xargs sed -i '/^$/N;/^\n$/D' No behavior change. Differential Revision: https://reviews.llvm.org/D101378	2021-04-27 12:41:39 -04:00
Petar Avramovic	4c6eb3886c	[MIPatternMatch]: Add matchers for binary instructions Add matchers that support commutative and non-commutative binary opcodes. Differential Revision: https://reviews.llvm.org/D99736	2021-04-27 11:37:42 +02:00
Petar Avramovic	39662abf72	[MIPatternMatch]: Add mi_match for MachineInstr This utility allows more efficient start of pattern match. Often MachineInstr(MI) is available and instead of using mi_match(MI.getOperand(0).getReg(), MRI, ...) followed by MRI.getVRegDef(Reg) that gives back MI we now use mi_match(MI, MRI, ...). Differential Revision: https://reviews.llvm.org/D99735	2021-04-27 11:08:16 +02:00
Petar Avramovic	ebe408ad80	[MIPatternMatch]: Add ICstRegMatch Matches G_CONSTANT and returns its def register. Differential Revision: https://reviews.llvm.org/D99734	2021-04-27 10:53:17 +02:00
Petar Avramovic	0713c82b13	[GlobalISel]: Add a getConstantIntVRegVal utility Returns ConstantInt from G_CONSTANT instruction given its def register. Differential Revision: https://reviews.llvm.org/D99733	2021-04-27 10:52:07 +02:00
dfukalov	e4c606acaf	[TTI] NFC: Change getScalarizationOverhead and getOperandsScalarizationOverhead to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101283	2021-04-27 08:51:48 +03:00
Lang Hames	a702fa2a04	[ORC] Make LLVMOrcLLJITBuilderSetJITTargetMachineBuilder consume as advertised. This should fix some of the memory leaks seen in the ORC C API test case.	2021-04-26 22:26:38 -07:00
Chen Zheng	e5000eef81	[XCOFF] make .file directive have directory info The .file directive is changed to only have basename in D36018 for ELF. But on AIX, we require the .file directive to also contain the directory info. This aligns with other AIX compiler like XLC and is required by some AIX tool like DBX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D99785	2021-04-27 00:15:23 -04:00
Lang Hames	d122d80b3d	Reapply "[ORC] Add unit tests for parts of the ..." with fixes and improvements. This reapplies `8740360093`, which was reverted in `bbddadd46e` due to buildbot errors. This version checks that a JIT instance can be safely constructed, skipping tests if it can not be. To enable this it introduces new C API to retrieve and set the target triple for a JITTargetMachineBuilder.	2021-04-26 20:44:40 -07:00
William S. Moses	7aa3cad46a	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 20:12:12 -04:00
Fangrui Song	18839be9c5	[ADT] Remove StatisticBase and make NoopStatistic empty In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 mostly unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text). ``` # This means the D69428 optimization on clang is mostly nullified by this patch. HEAD+D69428: size(.bss) = 0x0725a8 HEAD+D101211: size(.bss) = 0x072238 # bloaty - HEAD+D69428 vs HEAD+D101211 # With D101211, we also save a lot of string table space (.rodata). FILE SIZE VM SIZE -------------- -------------- -0.0% -32 -0.0% -24 .eh_frame -0.0% -336 [ = ] 0 .symtab -0.0% -360 [ = ] 0 .strtab [ = ] 0 -0.2% -880 .bss -0.0% -2.11Ki -0.0% -2.11Ki .rodata -0.0% -2.89Ki -0.0% -2.89Ki .text -0.0% -5.71Ki -0.0% -5.88Ki TOTAL ``` Note: LoopFuse is a disabled pass. For now this patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful in LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with `llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings. Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which calls `getName`/`getDesc`/`getValue`. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211	2021-04-26 16:47:32 -07:00
William S. Moses	8ede96493c	Revert "[NVPTX] Enable lowering of atomics on local memory" This reverts commit `fede99d386`.	2021-04-26 19:33:01 -04:00
William S. Moses	fede99d386	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 19:27:27 -04:00
Lei Zhang	254e289d45	Revert "[ADT] Remove StatisticBase and make NoopStatistic empty" This reverts commit `b540311781` because it breaks MLIR build: https://buildkite.com/mlir/mlir-core/builds/13299#ad0f8901-dfa4-43cf-81b8-7940e2c6c15b	2021-04-26 18:31:04 -04:00
Fangrui Song	e01c666b13	Revert D76519 "[NFC] Refactor how CFI section types are represented in AsmPrinter" This reverts commit `0ce723cb22`. D76519 was not quite NFC. If we see a CFISection::Debug function before a CFISection::EH one (-fexceptions -fno-asynchronous-unwind-tables), we may incorrectly pick CFISection::Debug and emit a `.cfi_sections .debug_frame`. We should use .eh_frame instead. This scenario is untested.	2021-04-26 15:17:28 -07:00
Lang Hames	c8fc5e3ba9	[ORC] C API updates. Adds support for creating custom MaterializationUnits in the C API with the new LLVMOrcCreateCustomMaterializationUnit function. Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any clients of the LLVMOrcAbsoluteSymbols API. Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to allow clients to get a reference to an LLJIT instance's linking layer, then emit an object file using it. This can be used to support construction of custom materialization units in the common case where those units will generate an object file that needs to be emitted to complete the materialization.	2021-04-26 13:58:37 -07:00
Lang Hames	8d718a0bff	[ORC] Fix type name. Rename JITTargetSymbolFlags to JITSymbolTargetFlags. This matches the convention used for JITSymbolGenericFlags.	2021-04-26 13:58:37 -07:00
Fangrui Song	b540311781	[ADT] Remove StatisticBase and make NoopStatistic empty In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic` but has 3 unused pointers. GlobalOpt considers that the pointers can potentially retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage 2 clang. This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The clang size is even smaller than applying D69428 (slightly smaller in both .bss and .text). ``` # This means the D69428 optimization on clang is mostly nullified by this patch. HEAD+D69428: size(.bss) = 0x0725a8 HEAD+D101211: size(.bss) = 0x072238 # bloaty - HEAD+D69428 vs HEAD+D101211 # With D101211, we also save a lot of string table space (.rodata). FILE SIZE VM SIZE -------------- -------------- -0.0% -32 -0.0% -24 .eh_frame -0.0% -336 [ = ] 0 .symtab -0.0% -360 [ = ] 0 .strtab [ = ] 0 -0.2% -880 .bss -0.0% -2.11Ki -0.0% -2.11Ki .rodata -0.0% -2.89Ki -0.0% -2.89Ki .text -0.0% -5.71Ki -0.0% -5.88Ki TOTAL ``` Note: LoopFuse is a disabled pass. This patch adds `#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful and not noisy, we can replace `llvm::Statistic` with `llvm::TrackingStatistic` in the future. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D101211	2021-04-26 13:39:35 -07:00
William S. Moses	494e77138c	[Lexer] Allow LLLexer to be used as an API Explose LLVM Lexer for usage externally as an API Differential Revision: https://reviews.llvm.org/D100920	2021-04-26 12:43:14 -04:00
Paul C. Anagnostopoulos	2d4c4d3c54	[TableGen] Change assertion information from a tuple to a struct [NFC] Differential Revision: https://reviews.llvm.org/D100854	2021-04-26 09:57:16 -04:00
Tim Renouf	8710eff6c3	[MC][AMDGPU][llvm-objdump] Synthesized local labels in disassembly 1. Add an accessor function to MCSymbolizer to retrieve addresses referenced by a symbolizable operand, but not resolved to a symbol. That way, the caller can synthesize labels at those addresses and then retry disassembling the section. 2. Implement that in AMDGPU -- a failed symbol lookup results in the address being added to a vector returned by the new function. 3. Use that in llvm-objdump when using MCSymbolizer (which only happens on AMDGPU) and SymbolizeOperands is on. Differential Revision: https://reviews.llvm.org/D101145 Change-Id: I19087c3bbfece64bad5a56ee88bcc9110d83989e	2021-04-26 13:56:36 +01:00
David Sherwood	a458b7855e	[AArch64] Add AArch64TTIImpl::getMaskedMemoryOpCost function When vectorising for AArch64 targets if you specify the SVE attribute we automatically then treat masked loads and stores as legal. Also, since we have no cost model for masked memory ops we believe it's cheap to use the masked load/store intrinsics even for fixed width vectors. This can lead to poor code quality as the intrinsics will currently be scalarised in the backend. This patch adds a basic cost model that marks fixed-width masked memory ops as significantly more expensive than for scalable vectors. Tests for the cost model are added here: Transforms/LoopVectorize/AArch64/masked-op-cost.ll Differential Revision: https://reviews.llvm.org/D100745	2021-04-26 11:00:03 +01:00
Levy Hsu	8cf54c7ff5	[RISCV] [1/2] Add IR intrinsic for Zbe extension RV32/64: bcompress bdecompress RV64 ONLY: bcompressw bdecompressw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101143	2021-04-25 19:14:34 -07:00
Tomasz Miąsko	1cea7ab4ba	[demangler] Use standard semantics for StringView::substr The StringView::substr now accepts a substring starting position and its length instead of previous non-standard `from` & `to` positions. All uses of two argument StringView::substr are in MicrosoftDemangler and have 0 as a starting position, so no changes are necessary. This also fixes a bug where attempting to extract a suffix with substr (a `to` position equal to size) would return a substring without the last character. Fixing the issue should not introduce observable changes in the demangler, since as currently used, a second argument to StringView::substr is either: 1) a result of a successful call to StringView::find and so necessarily smaller than size., or 2) in the case of Demangler::demangleCharLiteral potentially equal to size, but with demangler expecting more data to follow later on and failing either way. Reviewed By: #libc_abi, ldionne, erik.pilkington Differential Revision: https://reviews.llvm.org/D100246	2021-04-25 13:56:41 +02:00
Xiang1 Zhang	3b8ec86fd5	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-04-25 09:45:41 +08:00
Lang Hames	c572ff840f	[ORC][C-bindings] Fix missing ')' in comments.	2021-04-24 18:04:57 -07:00
Nikita Popov	95af971764	[PatternMatch] Improve m_Deferred() documentation (NFC) m_Deferred() has nothing to do with commutative matchers, it needs to be used whenever the value to match is determinde as part of the same match expression.	2021-04-24 21:00:24 +02:00
RamNalamothu	0ce723cb22	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-24 23:29:42 +05:30
dfukalov	6c57044231	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-04-24 14:14:20 +03:00
Teresa Johnson	10b781fb03	Mark type test intrinsics as speculatable to fix inline cost There is already code in InlineCost.cpp to identify and ignore ephemeral values (llvm.assume intrinsics and other side-effect free instructions only feeding the assumes). However, because llvm.type.test intrinsics were not marked speculatable, they and any instructions specifically feeding the type test (typically a bitcast) were being counted towards the instruction cost when inlining. This was causing profile matching issues in some cases when enabling -fwhole-program-vtables for whole program devirtualization. According to the language reference, the speculatable attribute means: "the function does not have any effects besides calculating its result and does not have undefined behavior". I see no reason why type tests cannot be marked with this attribute. There are 2 test changes: llvm/test/Transforms/Inline/ephemeral.ll: I added a type test intrinsic here to verify the fix. Also, I found the test was not actually testing what it originally intended. Many of the existing instructions were optimized away by -Oz, and the cost of inlining was negative due to the benefit of removing the call. So I changed the test to simply invoke the inline pass and check the number of instructions computed by InlineCost. I also fixed an instruction that was not actually used anywhere. llvm/test/Transforms/SimplifyCFG/no-md-sink.ll needed to be made more robust to code changes that reordered the metadata. Differential Revision: https://reviews.llvm.org/D101180	2021-04-23 10:02:31 -07:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Sander de Smalen	f9a50f04ba	[TTI] NFC: Change getIntImmCost[Inst\|Intrin] to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100565	2021-04-23 16:06:36 +01:00
Sander de Smalen	43ace8b5ce	[TTI] NFC: Change getScalingFactorCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100564	2021-04-23 16:06:36 +01:00
Sander de Smalen	008a072ded	[TTI] NFC: Change getMemcpyCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100563	2021-04-23 16:06:35 +01:00
Sander de Smalen	9ba07f37f8	[TTI] NFC: Change getGEPCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100562	2021-04-23 16:06:35 +01:00
Sander de Smalen	e0edfa052f	[TTI] NFC: Change getAddressComputationCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100561	2021-04-23 16:06:35 +01:00
dfukalov	9ab17a60eb	[TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D101151	2021-04-23 18:02:00 +03:00
Daniil Fukalov	f79d055791	[TTI] Fix ScalarizationCost initialization. In cases when ScalarizationCostPassed has no value, UINT_MAX is actually used for cost estimation in `return ScalarCalls * ScalarCost + ScalarizationCost`. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101099	2021-04-23 17:59:59 +03:00
Stephen Tozer	791930d740	Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" Previous build failures were caused by an error in bitcode reading and writing for DIArgList metadata, which has been fixed in `e5d844b587`. There were also some unnecessary asserts that were being triggered on certain builds, which have been removed. This reverts commit `dad5caa59e`.	2021-04-23 10:54:01 +01:00
Jay Foad	63af3c000b	[GlobalISel] Remove ConstantFoldingMIRBuilder ConstantFoldingMIRBuilder was an experiment which is not used for anything. The constant folding functionality is now part of CSEMIRBuilder. Differential Revision: https://reviews.llvm.org/D101050	2021-04-23 09:13:27 +01:00
Fangrui Song	2786e673c7	[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all} The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it. Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan) Also document the function attribute "frame-pointer" which is long overdue. Differential Revision: https://reviews.llvm.org/D101016	2021-04-22 18:07:30 -07:00
Levy Hsu	b49337bbb9	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Craig Topper	e01c419ecd	[RISCV] Add IR intrinsics for vmsge(u).vv/vx/vi. These instructions don't really exist, but we have ways we can emulate them. .vv will swap operands and use vmsle().vv. .vi will adjust the immediate and use .vmsgt(u).vi when possible. For .vx we need to use some of the multiple instruction sequences from the V extension spec. For unmasked vmsge(u).vx we use: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd For cases where mask and maskedoff are the same value then we have vmsge{u}.vx v0, va, x, v0.t which is the vd==v0 case that requires a temporary so we use: vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt For other masked cases we use this sequence: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0 We trust that register allocation will prevent vd in vmslt{u}.vx from being v0 since v0 is still needed by the vmxor. Differential Revision: https://reviews.llvm.org/D100925	2021-04-22 10:44:38 -07:00
Fangrui Song	ef5e7f90ea	Temporarily revert the code part of D100981 "Delete le32/le64 targets" This partially reverts commit `77ac823fd2`. Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934). Temporarily brings back the code part to give them some time for migration.	2021-04-22 10:18:44 -07:00
Krzysztof Parzyszek	deda60fcaf	[Hexagon] Add HVX intrinsics for conditional vector loads/stores Intrinsics for the following instructions are added. The intrinsic name is "int_hexagon_<inst>[_128B]", e.g. int_hexagon_V6_vL32b_pred_ai for 64-byte version int_hexagon_V6_vL32b_pred_ai_128B for 128-byte version V6_vL32b_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_nt_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vL32b_nt_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vS32b_pred_ai if (Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_pred_pi if (Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_pred_ppu if (Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32b_npred_ai if (!Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_npred_pi if (!Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_npred_ppu if (!Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32Ub_pred_ai if (Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_pred_pi if (Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_pred_ppu if (Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32Ub_npred_ai if (!Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_npred_pi if (!Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_npred_ppu if (!Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32b_nt_pred_ai if (Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_pred_pi if (Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_pred_ppu if (Pv4) vmem(Rx32++Mu2):nt = Vs32 V6_vS32b_nt_npred_ai if (!Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_npred_pi if (!Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_npred_ppu if (!Pv4) vmem(Rx32++Mu2):nt = Vs32	2021-04-22 11:49:29 -05:00
Irina Dobrescu	123ae42566	[flang][openmp] Add General Semantic Checks for Allocate Directive This patch adds semantic checks for the General Restrictions of the Allocate Directive. Since the requires directive is not yet implemented in Flang, the restriction: ``` allocate directives that appear in a target region must specify an allocator clause unless a requires directive with the dynamic_allocators clause is present in the same compilation unit ``` will need to be updated at a later time. A different patch will be made with the Fortran specific restrictions of this directive. I have used the code from https://reviews.llvm.org/D89395 for the CheckObjectListStructure function. Co-authored-by: Isaac Perry <isaac.perry@arm.com> Reviewed By: clementval, kiranchandramohan Differential Revision: https://reviews.llvm.org/D91159	2021-04-22 16:15:06 +00:00
Simon Pilgrim	41091614d6	[LTO] Caching.h - remove unused <string> include. NFCI.	2021-04-22 14:07:12 +01:00
Jun Ma	978eb3f168	[DAGCombiner] Allow operand of step_vector to be negative. It is proper to relax non-negative limitation of step_vector. Also this patch adds more combines for step_vector: (sub X, step_vector(C)) -> (add X, step_vector(-C)) Differential Revision: https://reviews.llvm.org/D100812	2021-04-22 20:58:03 +08:00
Jay Foad	82d34fe2b3	Fix typo "beneficiates" in comments	2021-04-22 12:30:16 +01:00
Wenlei He	dff8315892	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles The change adds support for triming and merging cold context when mergine CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful. Differential Revision: https://reviews.llvm.org/D100528	2021-04-22 00:42:37 -07:00
Max Kazantsev	8fe62b7af1	[GVN] Introduce loop load PRE This patch allows PRE of the following type of loads: ``` preheader: br label %loop loop: br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p br label %merge merge: ... br i1 ..., label %loop, label %exit ``` Into ``` preheader: %x0 = load %p br label %loop loop: %x.pre = phi(x0, x2) br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p %x1 = load %p br label %merge merge: x2 = phi(x.pre, x1) ... br i1 ..., label %loop, label %exit ``` So instead of loading from %p on every iteration, we load only when the actual clobber happens. The typical pattern which it is trying to address is: hot loop, with all code inlined and provably having no side effects, and some side-effecting calls on cold path. The worst overhead from it is, if we always take clobber block, we make 1 more load overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken at least once, the transform is neutral or profitable. There are several improvements prospect open up: - We can sometimes be smarter in loop-exiting blocks via split of critical edges; - If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that we don't know if their sum is colder than the header. Differential Revision: https://reviews.llvm.org/D99926 Reviewed By: reames	2021-04-22 12:50:38 +07:00
Giorgis Georgakoudis	a2dbfb6b72	[OpenMP] Simplify offloading parallel call codegen This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D95976	2021-04-21 18:46:07 -07:00
Fangrui Song	77ac823fd2	Delete le32/le64 targets They are unused now. Note: NaCl is still used and is currently expected to be needed until 2022-06 (https://blog.chromium.org/2020/08/changes-to-chrome-app-support-timeline.html). Differential Revision: https://reviews.llvm.org/D100981	2021-04-21 18:44:12 -07:00
Hongtao Yu	1a719089a8	[CSSPGO][llvm-profgen] Always report dangling probes for frames with real samples. Report dangling probes for frames that have real samples collected. Dangling probes are the probes associated to an empty block. When reported, sample count on a dangling probe will not be trusted by the compiler and we will rely on the counts inference algorithm to get the probe a reasonable count. This actually fixes a bug where previously only those dangling probes with samples collected were reported. This patch also fixes two existing issues. Pseudo probes are stored in `Address2ProbesMap` and their pointers are used in `PseudoProbeInlineTree`. Previously `std::vector` was used to store probes and the pointers to probes may get obsolete as the vector grows. I'm changing `std::vector` to `std::list` instead. The other issue is that all outlined functions shared the same inline frame previously due to the unchanged `Index` value as the dummy inlineSite identifier. Good results seen for SPEC2017 in general regarding profile quality. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D100235	2021-04-21 18:07:58 -07:00
Fangrui Song	ac303795a7	[IR] Add doc about Function::createWithDefaultAttr. NFC	2021-04-21 16:20:50 -07:00
Fangrui Song	775a9483e5	[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module. Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified. (i.e. If -g[123], every function gets `.eh_frame`. This behavior is strange but that is the status quo on GCC and Clang.) Let's take asan as an example. Other sanitizers are similar. `asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true, so every function gets `.eh_frame` if `-g[123]` is specified. This is the root cause that `-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame while `-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame. This patch * sets the nounwind attribute on sanitizer module ctor/dtor. * let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute. The "uwtable" mechanism is generic: synthesized functions not cloned/specialized from existing ones should consider `Function::createWithDefaultAttr` instead of `Function::create` if they want to get some default attributes which have more of module semantics. Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955 https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc. Differential Revision: https://reviews.llvm.org/D100251	2021-04-21 15:58:20 -07:00
dfukalov	a8b35e0f52	[TTI] NFC: Change getVectorSplitCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D100952	2021-04-21 17:32:02 +03:00
Anirudh Prasad	8f6185c713	[AsmParser][ms][X86] Fix possible misbehaviour in parsing of special tokens at start of string. - Previously, https://reviews.llvm.org/D72680 introduced a new attribute called `AllowSymbolAtNameStart` (in relation to the MAsmParser changes) in `MCAsmInfo.h` which (according to the comment in the header) allows the following behaviour: ``` /// This is true if the assembler allows $ @ ? characters at the start of /// symbol names. Defaults to false. ``` - However, the usage of this field in AsmLexer.cpp doesn't seem completely accurate* for a couple of reasons. ``` default: if (MAI.doesAllowSymbolAtNameStart()) { // Handle Microsoft-style identifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]* if (!isDigit(CurChar) && isIdentifierChar(CurChar, MAI.doesAllowAtInName(), AllowHashInIdentifier)) return LexIdentifier(); } ``` 1. The Dollar and At tokens, when occurring at the start of the string, are treated as separate tokens (AsmToken::Dollar and AsmToken::At respectively) and not lexed as an Identifier. 2. I'm not too sure why `MAI.doesAllowAtInName()` is used when `AllowAtInIdentifier` could be used. For X86 platforms, afaict, this shouldn't be an issue, since the `CommentString` attribute isn't "@". (alternatively the call to the setter can be set anywhere else as needed). The `AllowAtInName` does have an additional important meaning, but in the context of AsmLexer, shouldn't mean anything different compared to `AllowAtInIdentifier` My proposal is the following: - Introduce 3 new fields called `AllowQuestionTokenAtStartOfString`, `AllowDollarTokenAtStartOfString` and `AllowAtTokenAtStartOfString` in MCAsmInfo.h which will encapsulate the previously documented behaviour of "allowing $, @, ? characters at the start of symbol names") - Introduce these fields where "$", "@" are lexed, and treat them as identifiers depending on whether `Allow[Dollar\|At]TokenAtStartOfString` is set. - For the sole case of "?", append it to the existing logic for treating a "default" token as an Identifier. z/OS (HLASM) will also make use of some of these fields in follow up patches. completely accurate* - This was based on the comments and the intended behaviour the code. I might have completely misinterpreted it, and if that is the case my sincere apologies. We can close this patch if necessary, if there are no changes to be made :) Depends on https://reviews.llvm.org/D99374 Reviewed By: Jonathan.Crowther Differential Revision: https://reviews.llvm.org/D99889	2021-04-21 10:21:09 -04:00
Nico Weber	ba7a92c01e	[Support] Don't include VirtualFileSystem.h in CommandLine.h CommandLine.h is indirectly included in ~50% of TUs when building clang, and VirtualFileSystem.h is large. (Already remarked by jhenderson on D70769.) No behavior change. Differential Revision: https://reviews.llvm.org/D100957	2021-04-21 10:19:01 -04:00
Simon Pilgrim	68b9b769b5	[MC] MCInstrDesc.h - remove unnecessary <string> include. NFCI.	2021-04-21 15:07:00 +01:00
Fraser Cormack	e6ff89dc2e	[SelectionDAG] Fix minor typo in ISDOpcodes.h. NFC	2021-04-21 14:38:07 +01:00
serge-sans-paille	d9806334d1	Use SmallVector instead of std::vector to manage storage of llvm::BitVector This is a follow-up to https://reviews.llvm.org/D100387. std::vector is not the best storage container here. My local benchmark (counting the number of instruction when compiling the sqlite3 amalgamation) yields the following: - std::vector<BitVector> -> 5,860,885,896 - SmallVector<BitWord, 0> -> 5,858,991,997 - SmallVector<BitWord> -> 5,817,679,224 Differential Revision: https://reviews.llvm.org/D100744	2021-04-21 07:31:28 +02:00
Arthur Eubanks	dd56715326	[NFC] Remove redundant InstCombinePass name	2021-04-20 22:23:07 -07:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst a IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Roman Lebedev	7186764884	[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr()	2021-04-20 21:29:21 +03:00
Joseph Huber	b2ad63d3cf	[OpenMP] Add OpenMPOpt as a Module pass Summary: This patch registers OpenMPOpt as a Module pass in addition to a CGSCC pass. This is so certain optimzations that are sensitive to intact call-sites can happen before inlining. The old `openmpopt` pass name is changed to `openmp-opt-cgscc` and `openmp-opt` calls the Module pass. The current module pass only runs a single check but will be expanded in the future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D99202	2021-04-20 12:28:58 -04:00
Simon Pilgrim	2a419a0b99	[X86][SSE] combineX86ShuffleChain - check if we're blending with zero into already zero elements Add a SelectionDAG::MaskedElementsAreZero helper that wraps SelectionDAG::MaskedValueIsZero testing for entirely zero vector elements	2021-04-20 17:09:49 +01:00
Ahmed Bougacha	a8a3a43792	[AArch64] Add apple-m1 CPU, and default to it for macOS. apple-m1 has the same level of ISA support as apple-a14, so this is a straightforward mechanical change. However, that also means this inherits apple-a14's v8.5a+nobti quirkiness. rdar://68287159	2021-04-20 08:41:04 -07:00
Matt Arsenault	83a25a1010	GlobalISel: Restrict narrow scalar for fptoui/fptosi results This practically only works for the f16 case AMDGPU uses, not wider types. Fixes bug 49710 by failing legalization.	2021-04-20 10:54:40 -04:00
Andrea Di Biagio	2226d21896	[MCA][LSUnit] Fix a potential use after free in the logic that updates memory groups. Make sure that the `CriticalMemoryInstruction` of a memory group is invalidated if it references an already executed instruction. This avoids a potential use-after-free if the critical memory info becomes stale, and the value is read after the instruction has executed.	2021-04-20 13:30:45 +01:00
Cullen Rhodes	8a6772f3aa	[ValueTypes] Fix sizes of v256i32 and v256f32 (8182 -> 8192)	2021-04-20 12:10:02 +00:00
Simon Pilgrim	5ed8cea9a8	[Support] APInt.h - remove <algorithm> include. NFCI. Replace std::min use which should allow us to avoid including the <algorithm> header in every include of APInt.h.	2021-04-20 11:21:39 +01:00
Simon Pilgrim	1c6df71a9b	[CodeGen] CodeGenPassBuilder.h - remove unnecessary <string> include. NFCI. We only use StringRef so include that.	2021-04-20 11:21:39 +01:00
Simon Pilgrim	2ea6ed9b70	[Support] BinaryStreamReader.h - remove unnecessary <string> include. NFCI. We only use StringRef so include that.	2021-04-20 10:31:12 +01:00
Arthur Eubanks	5e71b9fa93	Explicitly pass type to cast load constant folding result Previously we would use the type of the pointee to determine what to cast the result of constant folding a load. To aid with opaque pointer types, we should explicitly pass the type of the load rather than looking at pointee types. ConstantFoldLoadThroughBitcast() converts the const prop'd value to the proper load type (e.g. [1 x i32] -> i32). Instead of calling this in every intermediate step like bitcasts, we only call this when we actually see the global initializer value. In some existing uses of this API, we don't know the exact type we're loading from immediately (e.g. first we visit a bitcast, then we visit the load using the bitcast). In those cases we have to manually call ConstantFoldLoadThroughBitcast() when simplifying the load to make sure that we cast to the proper type. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100718	2021-04-20 00:53:21 -07:00
Fraser Cormack	457da7f298	[SelectionDAG] Relax constraints on STEP_VECTOR step operand This patch relaxes the requirement that the STEP_VECTOR step constant must be of a type at least as large as the vector element type. This does not permit its use on targets which have legal vector element types larger than the largest legal scalar type, such as i64 vectors on RV32. As such, the requirement has been loosened so that the step operand must be any scalar type so long as the constant immediate is non-negative and the value fits inside the vector element type. This limits combining optimizations in certain circumstances but in practice it's unlikely to be a hindrance. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D100660	2021-04-20 08:41:42 +01:00
Hongtao Yu	1812319292	[CSSPGO] Flip SkipPseudoOp to true for MIR APIs. Flipping the default value of SkipPseudoOp to true for those MIR APIs to favor maximum performance. Note that certain spots like branch folding and MIR if-conversion is are disabled for better counts quality. For these two optimizations, this is a no-diff change. The counts quality with SPEC2017 before/after this change is unchanged. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100332	2021-04-19 17:55:34 -07:00
David Penry	ca8eef7e3d	[CodeGen] Use ProcResGroup information in SchedBoundary When the ProcResGroup has BufferSize=0, 1. if there is a subunit in the list of write resources for the scheduling class, do not attempt to schedule the ProcResGroup. 2. if there is not a subunit in the list of write resources for the scheduling class, choose a subunit to use instead of the ProcResGroup. 3. having both the ProcResGroup and any of its subunits in the resources implied by a InstRW is not supported. Used to model parallel uses from a pool of resources. Differential Revision: https://reviews.llvm.org/D98976	2021-04-19 21:27:45 +01:00
Jinsong Ji	d88d8c5b86	[PowerPC] Disable relative lookup table converter pass for AIX XCOFF hasn't implemented lowerRelativeReference. So we need to disable new pass introduced by https://reviews.llvm.org/D94355 for AIX for now. Reviewed By: gulfem Differential Revision: https://reviews.llvm.org/D100584	2021-04-19 19:28:11 +00:00
Nick Desaulniers	c440b97d89	[TargetLowering] move "o" and "X" constraint handling to base class These constraints are machine agnostic; there's no reason to handle these per-arch. If arches don't support these constraints, then they will fail elsewhere during instruction selection. We don't need virtual calls to look these up; TargetLowering::getInlineAsmMemConstraint should only be overridden by architectures with additional unique memory constraints. Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D100416	2021-04-19 10:53:31 -07:00
Jessica Paquette	91bbb914e0	[AArch64][GlobalISel] Regbankselect + select @llvm.aarch64.neon.uaddlv It turns out we actually import a bunch of selection code for intrinsics. The imported code checks that the register banks on the G_INTRINSIC instruction are correct. If so, it goes ahead and selects it. This adds code to AArch64RegisterBankInfo to allow us to correctly determine register banks on intrinsics which have known register bank constraints. For now, this only handles @llvm.aarch64.neon.uaddlv. This is necessary for porting AArch64TargetLowering::LowerCTPOP. Also add a utility for getting the intrinsic ID from a G_INTRINSIC instruction. This seems a little nicer than having to know about how intrinsic instructions are structured. Differential Revision: https://reviews.llvm.org/D100398	2021-04-19 10:47:49 -07:00
Roman Lebedev	b8a3705896	[NFCI][SCEVExpander] Extract GetOptimalInsertionPointForCastOf() helper	2021-04-19 18:38:38 +03:00
Simon Pilgrim	ddcdeae358	[Analysis] ImportedFunctionsInliningStatistics.h - add <memory> and remove unused <string> include. NFCI. Move <string> include to ImportedFunctionsInliningStatistics.cpp and add missing <memory> include as we have explicit uses of std::unique_ptr in the header.	2021-04-19 16:20:56 +01:00
Simon Pilgrim	3b02de173b	[Support] Memory.h - remove unnecessary <string> include. NFCI. protectMappedMemory no longer returns an error message, so we don't need std::string - I've fixed an unnecessary doxygen entry as well (oddly I wasn't seeing a Wdocumentation warning)	2021-04-19 14:32:07 +01:00
Abhina Sreeskantharajan	05b4babc9d	[SystemZ][z/OS] Set more text files as text This patch corrects more instances of text files being opened as text. Reviewed By: Jonathan.Crowther Differential Revision: https://reviews.llvm.org/D100654	2021-04-19 09:31:46 -04:00
Paul C. Anagnostopoulos	a5aaec8f4e	[TableGen] Add support for the 'assert' statement in multiclasses This is step 3 of adding the 'assert' statement. Differential Revision: https://reviews.llvm.org/D99751	2021-04-19 09:01:42 -04:00
Simon Pilgrim	cf2fc41bd1	[IR] GlobalObject.h - remove unused <utility> include. NFCI. In fact there's no explicit use of any std:: type or method in this header.	2021-04-19 13:25:35 +01:00
Simon Pilgrim	228207fe94	[IR] GlobalObject.h - remove unused <string> include. NFCI. All string usage is hidden behind StringRefs - so we don't need an explicit <string> include.	2021-04-19 12:56:10 +01:00
Simon Pilgrim	7f0ea5c8b6	[MCA] CodeEmitter.h - remove unused <string> include. NFCI. Add explicit SmallString.h include - which is used in the header	2021-04-19 12:56:09 +01:00
Cullen Rhodes	f0bc2782f2	[TTI] NFC: Remove unused 'OptSize' parameter from shouldMaximizeVectorBandwidth Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D100377	2021-04-19 11:01:34 +00:00
OCHyams	0ebf9a8e34	[DebugInfo] Move the findDbg* functions into DebugInfo.cpp Move the findDbg* functions into lib/IR/DebugInfo.cpp from lib/Transforms/Utils/Local.cpp. D99169 adds a call to a function (findDbgUsers) that lives in lib/Transforms/Utils/Local.cpp (LLVMTransformUtils) from lib/IR/Value.cpp (LLVMCore). The Core lib doesn't include TransformUtils. The builtbots caught this here: https://lab.llvm.org/buildbot/#/builders/109/builds/12664. This patch moves the function, and the 3 similar ones for consistency, into DebugInfo.cpp which is part of LLVMCore. Reviewed By: dblaikie, rnk Differential Revision: https://reviews.llvm.org/D100632	2021-04-19 10:30:25 +01:00
Roman Lebedev	d480f968ad	Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`" As being discussed in https://reviews.llvm.org/D100721, this modelling is lossy, we can't reconstruct `ash`/`ashr exact` from it, which means that whenever we actually expand the IR, we've just pessimized the code.. It would be good to model this pattern, after all it comes up every time you want to compute a distance between two pointers, but not at this cost. This reverts commit `ec54867df5`.	2021-04-18 16:26:45 +03:00
Juneyoung Lee	2813acb7d1	Update m_Undef to match vectors/aggrs with undefs and poisons mixed This fixes https://reviews.llvm.org/D93990#2666922 by teaching `m_Undef` to match vectors/aggrs with poison elements. As suggested, fixes in InstCombine files to use the `m_Undef` matcher instead of `isa<UndefValue>` will be followed. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D100122	2021-04-18 10:57:04 +09:00
Florian Hahn	d91f864ced	[ADT] Update RPOT to work with specializations of different types. At the moment, ReversePostOrderTraversal performs a post-order walk on the entry node of the passed in graph, rather than the graph type itself. If GT::NodeRef is the same as GraphT, everything works as expected and this is the case for the current uses in-tree. But it does not work as expected if GraphT != GT::NodeRef. In that case, we either fail to build (if there is no GraphTrait specialization for GT:NodeRef) or we pick the GraphTrait specialization for GT::NodeRef, instead of the specialization of GraphT. Both the depth-first and post-order iterators pick the expected specalization and this patch updates ReversePostOrderTraversal to delegate to po_begin & po_end to pick the right specialization, rather than forcing using GraphTraits<GT::NodeRef>, by first getting the entry node. This makes `ReversePostOrderTraversal<Graph<6>> RPOT(G);` build and work as expected in the test. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D100169	2021-04-17 20:45:04 +01:00
Florian Hahn	bbf01f96b5	[ADT] Take graph as const & in some post-order iterators (NFC). This patch updates a couple of functions that unnecessarily took the input graph by value, when it was not needed. They can take the graph by const-reference instead, which does not require GraphT to provide a copy constructor. Split off from D100169.	2021-04-17 17:05:24 +01:00
Simon Pilgrim	595394321d	[Support] AbsoluteDifference - add brackets to appease static analyzer warning. NFCI.	2021-04-17 13:47:02 +01:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalize the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
serge-sans-paille	550ed575cb	Simplify BitVector code Instead of managing memory by hand, delegate it to std::vector. This makes the code much simpler, and also avoids repeatedly computing the storage size. According to valgrind --tool=callgrind, this also slightly decreases the instruction count, but by a small margin. This is a recommit of `82f0e3d3ea` with one usage fixed in llvm/lib/CodeGen/RegisterScavenging.cpp. Not the slight API change: BitVector::clear() now has the same behavior as any other container: it does not free memory, but indeed sets the size of the BitVector to 0. It is thus incorrect to access its content right afterwards, a scenario which wasn't enforced in previous implementation. Differential Revision: https://reviews.llvm.org/D100387	2021-04-16 22:48:33 +02:00
Thomas Lively	5c729750a6	[WebAssembly] Remove saturating fp-to-int target intrinsics Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead. This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u}, which are now represented in IR as a saturating truncation to a v2i32 followed by a concatenation with a zero vector. Differential Revision: https://reviews.llvm.org/D100596	2021-04-16 12:11:20 -07:00
Nico Weber	0b36a33ab8	Reland "[Support] Don't include <algorithm> in MathExtras.h" This reverts commit `af2a93fd6e`. This time, add the include to APInt.h, which apparently relied on getting this include transitively.	2021-04-16 14:07:45 -04:00
Stella Stamenova	af2a93fd6e	Revert "[Support] Don't include <algorithm> in MathExtras.h" This reverts commit `6580d8a2b1`.	2021-04-16 10:22:32 -07:00
Nick Lewycky	244d9d6e41	Verify the LLVMContext that an Attribute belongs to. Attributes don't know their parent Context, adding this would make Attribute larger. Instead, we add hasParentContext that answers whether this Attribute belongs to a particular LLVMContext by checking for itself inside the context's FoldingSet. Same with AttributeSet and AttributeList. The Verifier checks them with the Module context. Differential Revision: https://reviews.llvm.org/D99362	2021-04-16 09:44:38 -07:00
Stanislav Mekhanoshin	0777d1ec06	Ignore assume like calls by default in hasAddressTaken() Differential Revision: https://reviews.llvm.org/D96179	2021-04-16 09:37:33 -07:00
Nico Weber	da62725874	[ADT] Don't include <algorithm> in iterator.h As far as I can tell, nothing in iterator.h uses anything from <algorithm>. Differential Revision: https://reviews.llvm.org/D100659	2021-04-16 12:21:08 -04:00
Michael Liao	853da5977e	Revert "[Support] Don't include <algorithm> in Hashing.h" This reverts commit `ef620c40f3`. - `std::rotate` still needs <alogirthm>	2021-04-16 12:17:42 -04:00
Nico Weber	ef620c40f3	[Support] Don't include <algorithm> in Hashing.h The include is for std::swap(), but that's in <utility> in C++11, and Hashing.h already includes that. Differential Revision: https://reviews.llvm.org/D100657	2021-04-16 12:04:53 -04:00
Nico Weber	6580d8a2b1	[Support] Don't include <algorithm> in MathExtras.h MathExtras.h is indirectly included in over 98% of LLVM's translation units. It currently expands to over 1MB of stuff, over which far more than half is due to <algorithm>. Since not using <algorithm> is slightly less code, do that. No behavior change. Differential Revision: https://reviews.llvm.org/D100656	2021-04-16 11:53:31 -04:00
Mats Petersson	517c3aee4d	[OpenMP IRBuilder, MLIR] Add support for OpenMP do schedule dynamic The implementation supports static schedule for Fortran do loops. This implements the dynamic variant of the same concept. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D97393	2021-04-16 16:09:49 +01:00
Abhina Sreeskantharajan	3be2ba0ba3	[SystemZ][z/OS][Windows] Add new functions that set Text/Binary mode for Stdin and Stdout based on OpenFlags On Windows, we want to open a file in Binary mode if OF_CRLF bit is not set. On z/OS, we want to open a file in Binary mode if the OF_Text bit is not set. This patch creates two new functions called ChangeStdinMode and ChangeStdoutMode which will take OpenFlags as an arg to determine which mode to set stdin and stdout to. This will enable patches like https://reviews.llvm.org/D100056 to not affect Windows when setting the OF_Text flag for raw_fd_streams. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D100130	2021-04-16 08:09:19 -04:00
Arthur Eubanks	9c776c2fa2	[NFC][NewPM] Remove some AnalysisManager invalidate methods These were misleading, they're more of a "clear" than an "invalidate". We shouldn't be individually clearing analysis results. Either we clear all analyses when some IR becomes invalid, or we properly go through invalidation. There was only one use of this, which can be simulated with AM.invalidate(F, PA). Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D100519	2021-04-15 16:51:26 -07:00
River Riddle	706c9c5ce0	[mlir] Add support for walking locations similarly to Operations This allows for walking all nested locations of a given location, and is generally useful when processing locations. Differential Revision: https://reviews.llvm.org/D100437	2021-04-15 16:09:34 -07:00
Momchil Velikov	f9d932e673	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e..g byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role, previously perfomed by align, falling back to align if stackalign` is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
cchen	e0c2125d1d	[OpenMP] Added codegen for masked directive Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100514	2021-04-15 12:55:07 -05:00
Arthur Eubanks	c8f0a7c215	[NewPM] Cleanup IR printing instrumentation Being lazy with printing the banner seems hard to reason with, we should print it unconditionally first (it could also lead to duplicate banners if we have multiple functions in -filter-print-funcs). The printIR() functions were doing too many things. I separated out the call from PrintPassInstrumentation since we were essentially doing two completely separate things in printIR() from different callers. There were multiple ways to generate the name of some IR. That's all been moved to getIRName(). The printing of the IR name was also inconsistent, now it's always "IR Dump on $foo" where "$foo" is the name. For a function, it's the function name. For a loop, it's what's printed by Loop::print(), which is more detailed. For an SCC, it's the list of functions in parentheses. For a module it's "[module]", to differentiate between a possible SCC with a function called "module". To preserve D74814, we have to check if we're going to print anything at all first. This is unfortunate, but I would consider this a special case that shouldn't be handled in the core logic. Reviewed By: jamieschmeiser Differential Revision: https://reviews.llvm.org/D100231	2021-04-15 09:50:55 -07:00
LemonBoy	24185541ca	[yaml2obj/obj2yaml/llvm-readobj] Support printing and parsing AVR-specific e_flags The `e_flags` contains a mixture of bitfields and regular ones, ensure all of them can be serialized and deserialized. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100250	2021-04-15 15:54:28 +02:00
Alex Orlov	49cbf4cd85	Fix bug in .eh_frame/.debug_frame PC offset calculation for DW_EH_PE_pcrel This fixes the following bugs: https://bugs.llvm.org/show_bug.cgi?id=27249 https://bugs.llvm.org/show_bug.cgi?id=46414 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100328	2021-04-15 15:06:20 +04:00
dfukalov	ce1626f34a	[AA] Updates for D95543. Addressing latter comments in D95543: - `AliasResult::Result` renamed to `AliasResult::Kind` - Offset printing added for `PartialAlias` case in `-aa-eval` - Removed VisitedPhiBBs check from BasicAA' Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D100454	2021-04-15 12:22:03 +03:00
Nikita Popov	a1ed025d0e	Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting" This reverts commit `faf9f11589`. Issues with this patch have been reported in https://reviews.llvm.org/D100264#2689917 and https://bugs.llvm.org/show_bug.cgi?id=49967.	2021-04-15 09:43:52 +02:00
Nico Weber	d5e8dca1b6	fix comment typos to cycle bots	2021-04-14 22:12:56 -04:00
Alexander Yermolovich	b7459a10da	[DWARF] Fix crash for DWARFDie::dump. When DIE is extracted manually, the DieArray is empty. When dump is invoked on aforementioned DIE it tries to extract child, even if Dump options say otherwise. Resulting in crash. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99698	2021-04-14 18:46:34 -07:00
Sterling Augustine	8f9477b067	Revert "Simplify BitVector code" This reverts commit `82f0e3d3ea`. The change breaks the asan buildbots. https://lab.llvm.org/buildbot/#/builders/99/builds/2835	2021-04-14 18:06:51 -07:00
Philip Reames	dd985551c2	Reapply "[InferAttributes] Materialize all infered attributes for declaration"" and follow on patches. This reverts commit `ab98f2c712` and `98eea392cd`. It includes a fix for the clang test which triggered the revert. I failed to notice this one because there was another AMDGPU llvm test with a similiar name and the exact same text in the error message. Odd. Since only one build bot reported the clang test, I didn't notice that one.	2021-04-14 16:38:07 -07:00
Nico Weber	ab98f2c712	Revert "[InferAttributes] Materialize all infered attributes for declaration" Breaks check-clang, see comments on D100400 Also revert follow-up "[NFC] Move a recently added utility into a location to enable reuse" This reverts commit `3ce61fb6d6`. This reverts commit `61a85da882`.	2021-04-14 18:41:20 -04:00
Philip Reames	3ce61fb6d6	[NFC] Move a recently added utility into a location to enable reuse About to refresh a patch that uses this in FunctionAtrrs, doing the move seperately to control build times.	2021-04-14 15:05:16 -07:00
Thomas Lively	6a18cc23ef	[WebAssembly] Codegen for i64x2.extend_{low,high}_i32x4_{s,u} Removes the builtins and intrinsics used to opt in to using these instructions and replaces them with normal ISel patterns now that they are no longer prototypes. Differential Revision: https://reviews.llvm.org/D100402	2021-04-14 13:43:09 -07:00
serge-sans-paille	82f0e3d3ea	Simplify BitVector code Instead of managing memory by hand, delegate it to std::vector. This makes the code much simpler, and also avoids repeatedly computing the storage size. According to valgrind --tool=callgrind, this also slightly decreases the instruction count, but by a small margin. Differential Revision: https://reviews.llvm.org/D100387	2021-04-14 21:28:08 +02:00
Thomas Lively	af7925b4dd	[WebAssembly] Codegen for f64x2.convert_low_i32x4_{s,u} Add a custom DAG combine and ISD opcode for detecting patterns like (uint_to_fp (extract_subvector ...)) before the extract_subvector is expanded to ensure that they will ultimately lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions are no longer prototypes and can now be produced via standard IR, this commit also removes the target intrinsics and builtins that had been used to prototype the instructions. Differential Revision: https://reviews.llvm.org/D100425	2021-04-14 10:42:45 -07:00
Momchil Velikov	a0124f4e4d	Remove deprecated member functions (NFC) Remove the member functions getByValAlign and getOrigAlign, there were no users left. Differential Revision: https://reviews.llvm.org/D99098	2021-04-14 18:06:53 +01:00
Sander de Smalen	4f42d873c2	[TTI] NFC: Change getArithmeticInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100317	2021-04-14 17:20:36 +01:00
Sander de Smalen	d84bd951a8	[TTI] NFC: Change getFPOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D100316	2021-04-14 17:20:36 +01:00
Sander de Smalen	1af35e77f4	[TTI] NFC: Change getVectorInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100315	2021-04-14 17:20:35 +01:00
Sander de Smalen	174e8f6c5e	[TTI] NFC: Change getShuffleCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100314	2021-04-14 17:20:35 +01:00
Sander de Smalen	14b934f8a6	[TTI] NFC: Change getCFInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D100313	2021-04-14 17:20:34 +01:00
Sander de Smalen	596f669cfb	[TTI] NFC: Change getCallInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D100312	2021-04-14 17:20:34 +01:00
Thomas Lively	af7ab81ce3	[WebAssembly] Use standard intrinsics for f32x4 and f64x2 ops Now that these instructions are no longer prototypes, we do not need to be careful about keeping them opt-in and can use the standard LLVM infrastructure for them. This commit removes the bespoke intrinsics we were using to represent these operations in favor of the corresponding target-independent intrinsics. The clang builtins are preserved because there is no standard way to easily represent these operations in C/C++. For consistency with the scalar codegen in the Wasm backend, the intrinsic used to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though @llvm.roundeven better captures the semantics of the underlying Wasm instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is left to a potential future patch. Differential Revision: https://reviews.llvm.org/D100411	2021-04-14 09:19:27 -07:00
Sjoerd Meijer	39d29817f3	[SCCP] Follow up of rGbbab9f986c6d. NFC. This addresses the linter messages, mainly the inconsistent capitalisation of member functions.	2021-04-14 17:14:46 +01:00
Sjoerd Meijer	bbab9f986c	[SCCP] Create SCCP Solver This refactors SCCP and creates a SCCPSolver interface and class so that it can be used by other passes and transformations. We will use this in D93838, which adds a function specialisation pass. This is based on an early version by Vinay Madhusudan. Differential Revision: https://reviews.llvm.org/D93762	2021-04-14 14:58:03 +01:00
Martin Storsjö	57b259a852	[Passes] Enable the relative lookup table converter pass on aarch64 After `d5c5cf5ce8`, it should work fine for aarch64 on COFF too. (It was disabled when the patch was (re)applied in `e96df3e531`, pending that fix.)	2021-04-14 13:15:41 +03:00
Pengfei Wang	184377da5c	[LLD] Implement /guard:[no]ehcont Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99078	2021-04-14 15:06:49 +08:00
Daniel Sanders	be50657c6a	[TableGen] Resolve concrete but not complete field access initializers This fixes the resolution of Rec10.Zero in ListSlices.td. As part of this, correct the definition of complete for ListInit such that it's complete iff all the elements in the list are complete rather than always being complete regardless of the elements. This is the reason Rec10.TwoFive from ListSlices.td previously resolved despite being incomplete like Rec10.Zero was Depends on D100247 Reviewed By: Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D100253	2021-04-13 15:14:56 -07:00
Anirudh Prasad	6ddd8c28b7	[AsmParser][SystemZ][z/OS] Add support to AsmLexer to accept HLASM style integers - Add support for HLASM style integers. These are the decimal integers [0-9]. - HLASM does not support the additional prefixed integers like, `0b`, `0x`, octal integers and Masm style integers. - To achieve this, a field `LexHLASMStyleIntegers` (similar to the `LexMasmStyleIntegers` field) is introduced in `MCAsmLexer.h` as well as a corresponding setter. Note: This field could also go into MCAsmInfo.h. I used the previous precedent set by the `LexMasmIntegers` field. Depends on https://reviews.llvm.org/D99286 Reviewed By: epastor Differential Revision: https://reviews.llvm.org/D99374	2021-04-13 15:29:37 -04:00
Nikita Popov	faf9f11589	[SCEV] Don't walk uses of phis without SCEV expression when forgetting I've run into some cases where a large fraction of compile-time is spent invalidating SCEV. One of the causes is forgetLoop(), which walks all values that are def-use reachable from the loop header phis. When invalidating a topmost loop, that might be close to all values in a function. Additionally, it's fairly common for there to not actually be anything to invalidate, but we'll still be performing this walk again and again. My first thought was that we don't need to continue walking the uses if the current value doesn't have a SCEV expression. However, this isn't quite right, because SCEV construction can skip over values (e.g. for a chain of adds, we might only create a SCEV expression for the final value). What this patch does instead is to only walk the (full) def-use chain of loop phis that have a SCEV expression. If there's no expression for a phi, then we also don't have any dependent expressions to invalidate. Differential Revision: https://reviews.llvm.org/D100264	2021-04-13 20:28:17 +02:00
Anirudh Prasad	f7eec83932	[AsmParser][SystemZ][z/OS] Add in support to allow use of additional comment strings. - Currently, MCAsmInfo provides a CommentString attribute, that various targets can set, so that the AsmLexer can appropriately lex a string as a comment based on the set value of the attribute. - However, AsmLexer also supports a few additional comment syntaxes, in addition to what's specified as a CommentString attribute. This includes regular C-style block comments (/* ... /), regular C-style line comments (// .... ) and #. While I'm not sure as to why this behaviour exists, I am assuming it does to maintain backward compatibility with GNU AS (see https://sourceware.org/binutils/docs/as/Comments.html#Comments for reference) For example: Consider a target which sets the CommentString attribute to ''. The following strings are all lexed as comments. ``` "# abc" -> comment "// abc" -> comment "/* abc / -> comment " abc" -> comment ``` - In HLASM however, only "*" is accepted as a comment string, and nothing else. - To achieve this, an additional attribute (`AllowAdditionalComments`) has been added to MCAsmInfo. If this attribute is set to false, then only the string specified by the CommentString attribute is used as a possible comment string to be lexed by the AsmLexer. The regular C-style block comments, line comments and "#" are disabled. As a final note, "#" will still be treated as a comment, if the CommentString attribute is set to "#". Depends on https://reviews.llvm.org/D99277 Reviewed By: abhina.sreeskantharajan, myiwanch Differential Revision: https://reviews.llvm.org/D99286	2021-04-13 11:15:09 -04:00
Sander de Smalen	03f47bdcb1	[TTI] NFC: Change get[Interleaved]MemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100205	2021-04-13 14:21:02 +01:00
Sander de Smalen	d676b5749d	[TTI] NFC: Change getMaskedMemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100204	2021-04-13 14:21:01 +01:00
Sander de Smalen	db134e2428	[TTI] NFC: Change getCmpSelInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100203	2021-04-13 14:21:01 +01:00
Sander de Smalen	2285dfb73f	[TTI] NFC: Change getMinMaxReductionCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100202	2021-04-13 14:21:00 +01:00
Sander de Smalen	bd86824d98	[TTI] NFC: Change getArithmeticReductionCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html This patch is practically NFC, with the exception of an AArch64 SVE related cost-model change, where we can now return an Invalid cost instead of some bogus number. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100201	2021-04-13 14:20:59 +01:00
Sander de Smalen	fd1f8a5462	[TTI] NFC: Change getGatherScatterOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100200	2021-04-13 14:20:59 +01:00
Sander de Smalen	92d8421f49	[TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100199	2021-04-13 14:20:58 +01:00
Martin Storsjö	45f8946a75	[CodeView] Fix the ARM64 CPUType enum The old, incorrect one seems to have been added in `d41ac895bb`, with a similarly placed entry added in EnumTables.cpp in `eb4d6142dc`. This matches the value documented at https://docs.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cpu-type-e?view=vs-2019. This fixes running obj2yaml on an object file generated by MSVC. Differential Revision: https://reviews.llvm.org/D100306	2021-04-13 12:54:22 +03:00
Amy Huang	dad5caa59e	Revert "Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This change causes an assert / segmentation fault in LTO builds. This reverts commit `f2e4f3eff3`.	2021-04-12 20:10:17 -07:00
Freddy Ye	3fc1fe8db8	[X86] Support -march=rocketlake Reviewed By: skan, craig.topper, MaskRay Differential Revision: https://reviews.llvm.org/D100085	2021-04-13 09:48:13 +08:00
Gulfem Savrun Yeniceri	e96df3e531	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-04-13 01:29:41 +00:00
Arthur Eubanks	a8ab1f98d2	[Evaluator] Look through invariant.group intrinsics Turning on -fstrict-vtable-pointers in Chrome caused an extra global initializer. Turns out that a llvm.strip.invariant.group intrinsic was causing GlobalOpt to fail to step through some simple code. We can treat .invariant.group uses as simply their operand. Value::stripPointerCastsForAliasAnalysis() does exactly this. This should be safe because the Evaluator does not skip memory accesses due to invariants or alias analysis. However, we don't want to leak that we've stripped arbitrary pointer casts to users of Evaluator, so we bail out if we evaluate a function to any constant, since we may have looked through .invariant.group calls and aliasing pointers cannot be arbitrarily substituted. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D98843	2021-04-12 16:12:15 -07:00
Yuanfang Chen	c5fda0e662	Reland "Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom"" This reverts commit `a3fabc79ae` (relands `f4d682d6ce` with fix for the compile-time regression issue).	2021-04-12 14:50:54 -07:00
Nikita Popov	a3fabc79ae	Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom" This reverts commit `f4d682d6ce`. This caused a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=4b7bad9eaea2233521a94f6b096aaa88dc584e23&to=f4d682d6ce6c5b3a41a0acf297507c82f5c21eef&stat=instructions Possibly this is due to overeager parsing of target triples.	2021-04-12 22:55:59 +02:00
Hamza Sood	0a92aff721	Replace uses of std::iterator with explicit using This patch removes all uses of `std::iterator`, which was deprecated in C++17. While this isn't currently an issue while compiling LLVM, it's useful for those using LLVM as a library. For some reason there're a few places that were seemingly able to use `std` functions unqualified, which no longer works after this patch. I've updated those places, but I'm not really sure why it worked in the first place. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D67586	2021-04-12 10:47:14 -07:00
Yuanfang Chen	f4d682d6ce	[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom D24453 enabled libcalls simplication for ARM PCS. This may cause caller/callee calling conventions mismatch in some situations such as LTO. This patch makes instcombine aware that the compatible calling conventions differences are benign (not emitting undef idom). Differential Revision: https://reviews.llvm.org/D99773	2021-04-12 09:32:23 -07:00
Stephen Tozer	f2e4f3eff3	Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" The causes of the previous build errors have been fixed in revisions `aa3e78a59f`, and `140757bfaa` This reverts commit `f40976bd01`.	2021-04-12 16:57:29 +01:00
Andrew Savonichev	f037b07b5c	Revert "[AArch64] Add Machine InstCombiner patterns for FMUL indexed variant" This reverts commit `cca9b5985c`. Buildbot reported an error for CodeGen/AArch64/machine-combiner-fmul-dup.mir: * Bad machine code: Virtual register killed in block, but needed live out. * - function: indexed_2s - basic block: %bb.0 entry (0x640fee8) Virtual register %7 is used after the block. * Bad machine code: Virtual register defs don't dominate all uses. * - function: indexed_2s - v. register: %7 LLVM ERROR: Found 2 machine code errors.	2021-04-12 16:28:49 +03:00
Andrew Savonichev	cca9b5985c	[AArch64] Add Machine InstCombiner patterns for FMUL indexed variant This patch adds DUP+FMUL => FMUL_indexed pattern to InstCombiner. FMUL_indexed is normally selected during instruction selection, but it does not work in cases when VDUP and VMUL are in different basic blocks. Differential Revision: https://reviews.llvm.org/D99662	2021-04-12 16:08:39 +03:00
Simon Pilgrim	199a21bd8c	[IR] Fix Wdocumentation warning. NFCI.	2021-04-12 11:20:57 +01:00
Zakk Chen	a8fc0e445c	[RISCV][Clang] Add all RVV Mask intrinsic functions. 1. Redefine vpopc and vfirst IR intrinsic so it could adapt on clang tablegen generator which always appends a type for vl in IntrinsicType of clang codegen. 2. Remove `c` type transformer and add `u` and `l` for unsigned long and long type. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100120	2021-04-11 19:19:02 -07:00
Roman Lebedev	9829f5e6b1	[CVP] @llvm.[us]{min,max}() intrinsics handling If we can tell that either one of the arguments is taken, bypass the intrinsic. Notably, we are indeed fine with non-strict predicate: * UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i * UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx * SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W * SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr Much like with all other comparison handling in CVP, while we could sort-of handle two Value's, at least for plain ICmpInst it does not appear to be worthwhile. This only fires 78 times on test-suite + dt + rs, but we don't canonicalize to these yet. (only SCEV produces them)	2021-04-11 00:33:47 +03:00
Wenlei He	00ef28ef21	[CSSPGO] Fix dangling context strings and improve profile order consistency and error handling This patch fixed the following issues along side with some refactoring: 1. Fix bugs where StringRef for context string out live the underlying std::string. We now keep string table in profile generator to hold std::strings. We also do the same for bracketed context strings in profile writer. 2. Make sure profile output strictly follow (total sample, name) order. Previously, there's inconsistency between ProfileMap's key and FunctionSamples's name, leading to inconsistent ordering. This is now fixed by introducing context profile canonicalization. Assertions are also added to make sure ProfileMap's key and FunctionSamples's name are always consistent. 3. Enhanced error handling for profile writing to make sure we bubble up errors properly for both llvm-profgen and llvm-profdata when string table is not populated correctly for extended binary profile. 4. Keep all internal context representation bracket free. This avoids creating new strings for context trimming, merging and preinline. getNameWithContext API is now simplied accordingly. 5. Factor out the code for context trimming and merging into SampleContextTrimmer in SampleProf.cpp. This enables llvm-profdata to use the trimmer when merging profiles. Changes in llvm-profgen will be in separate patch. Differential Revision: https://reviews.llvm.org/D100090	2021-04-10 12:39:10 -07:00
Roman Lebedev	257eda0794	[NFC][LVI] getPredicateAt(): drop default value for UseBlockValue The default is likely wrong. Out of all the callees, only a single one needs to pass-in false (JumpThread), everything else either already passes true, or should pass true. Until the default is flipped, at least make it harder to unintentionally add new callees with UseBlockValue=false.	2021-04-10 20:46:01 +03:00
Roman Lebedev	03225969e3	[NFC] Rename LimitingIntrinsic into MinMaxIntrinsic As requested in post-commit review	2021-04-10 20:46:01 +03:00
Roman Lebedev	e8c7f43e2c	[NFC][ConstantRange] Add 'icmp' helper method "Does the predicate hold between two ranges?" Not very surprisingly, some places were already doing this check, without explicitly naming the algorithm, cleanup them all.	2021-04-10 19:38:55 +03:00
Roman Lebedev	7b12c8c59d	Revert "[NFC][ConstantRange] Add 'icmp' helper method" This reverts commit `17cf2c9423`.	2021-04-10 19:37:53 +03:00
Roman Lebedev	8371dde485	Revert "zz" It wasn't meant to be committed, two commits should have been squashed. This reverts commit `0c18415496`.	2021-04-10 19:37:47 +03:00
Roman Lebedev	17cf2c9423	[NFC][ConstantRange] Add 'icmp' helper method "Does the predicate hold between two ranges?" Not very surprisingly, some places were already doing this check, without explicitly naming the algorithm, cleanup them all.	2021-04-10 19:09:52 +03:00
Roman Lebedev	0c18415496	zz	2021-04-10 19:09:17 +03:00
dfukalov	8f4b7e94a2	[AMDGPU][CostModel] Refine cost model for control-flow instructions. Added cost estimation for switch instruction, updated costs of branches, fixed phi cost. Had to increase `-amdgpu-unroll-threshold-if` default value since conditional branch cost (size) was corrected to higher value. Test renamed to "control-flow.ll". Removed redundant code in `X86TTIImpl::getCFInstrCost()` and `PPCTTIImpl::getCFInstrCost()`. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D96805	2021-04-10 09:20:24 +03:00
Duncan P. N. Exon Smith	0db6488a77	Support: Add move semantics to mapped_file_region Update llvm::sys::fs::mapped_file_region to have a move constructor and a move assignment operator, allowing it to be used as an Optional. Also, update FileOutputBuffer's OnDiskBuffer to take advantage of this, avoiding an extra allocation from the unique_ptr. A nice follow-up would be to make the mapped_file_region constructor private and replace its use with a factory function, such as mapped_file_region::create(), that returns an Expected (or ErrorOr). I don't plan on doing that immediately, but I might swing back later. No functionality change, besides the saved allocation in OnDiskBuffer. Differential Revision: https://reviews.llvm.org/D100159	2021-04-09 17:56:26 -07:00
cchen	1a43fd2769	[OpenMP51] Initial support for masked directive and filter clause Adds basic parsing/sema/serialization support for the #pragma omp masked directive. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99995	2021-04-09 14:00:36 -05:00
Adrian Prantl	6ce76ff7eb	Update the linkage name of coro-split functions in the debug info. This patch updates the linkage name in the DISubprogram of coro-split functions, which is particularly important for Swift, where the funclets have a special name mangling. This patch does not affect C++ coroutines, since the DW_AT_specification is expected to hold the (original) linkage name. I believe this is mostly due to limitations in AsmPrinter, so we might be able to relax this restriction in the future. Differential Revision: https://reviews.llvm.org/D99693	2021-04-09 09:50:56 -07:00
dfukalov	c1a88e007b	[AA][NFC] Convert AliasResult to class containing offset for PartialAlias case. Add an ability to store `Offset` between partially aliased location. Use this storage within returned `ResultAlias` instead of caching it in `AAQueryInfo`. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98718	2021-04-09 13:26:09 +03:00
dfukalov	d066079728	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. Main reason is preparation to transform AliasResult to class that contains offset for PartialAlias case. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98027	2021-04-09 12:54:22 +03:00
Max Kazantsev	275f3a2540	[GVN][NFC] Factor out load elimination logic via PRE for reuse	2021-04-09 16:12:25 +07:00
Duncan P. N. Exon Smith	4a84b03ece	ADT: Sink the guts of StringMapEntry::Create into StringMapEntryBase Sink the interesting parts of StringMapEntry::Create into a new function StringMapEntryBase::allocateWithKey that's only templated on the allocator, taking the entry size and alignment as parameters. As dblaikie pointed out in the review, it'd be interesting as a follow-up to make this more generic and maybe sink at least some of it into a source file; I haven't done that yet myself, but I left behind an encouraging comment. Differential Revision: https://reviews.llvm.org/D95654	2021-04-08 17:57:47 -07:00
Duncan P. N. Exon Smith	6dc432510f	Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC, 3rd attempt This reverts commit `e35afbe535`, reapplying `022ccedde8` and `e7ed5c920d`. - The first attempt missed defining `SignpostEmitterImpl`. - The second attempt missed defining `llvm::SignpostEmitterImpl`. Not sure how I failed to test both versions locally before; I thought I'd turned the feature off via rerunning `cmake` but it must have been stuck in place. This time I confirmed via `clang -E` that I was testing both build configurations. Original commit message: Replace some manual memory management with std::unique_ptr. Differential Revision: https://reviews.llvm.org/D100151	2021-04-08 17:05:59 -07:00
Duncan P. N. Exon Smith	e35afbe535	Revert "Revert "Revert "Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC""" This reverts commit `e7ed5c920d` again, due to more buildbot failures: https://lab.llvm.org/buildbot/#/builders/131/builds/8191	2021-04-08 16:58:12 -07:00
Duncan P. N. Exon Smith	e7ed5c920d	Revert "Revert "Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC"" This reverts commit `078072285d`, reapplying `022ccedde8`. I figured out why this was failing in other environments: it's not a problem with std::unique_ptr, but that SignpostEmitterImpl only has a forward declaration. Adding an empty definition should do the trick. Original commit message: Replace some manual memory management with std::unique_ptr. Differential Revision: https://reviews.llvm.org/D100151	2021-04-08 16:50:39 -07:00
Duncan P. N. Exon Smith	078072285d	Revert "Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC" This reverts commit `022ccedde8`. Looks like some hosts need a definition of SignpostEmitterImpl to put it in a unique_ptr: https://lab.llvm.org/buildbot/#/builders/92/builds/7733	2021-04-08 16:38:47 -07:00
Duncan P. N. Exon Smith	022ccedde8	Support: Use std::unique_ptr for SignpostEmitter::Impl, NFC Replace some manual memory management with std::unique_ptr. Differential Revision: https://reviews.llvm.org/D100151	2021-04-08 16:31:59 -07:00
Duncan P. N. Exon Smith	429088b9e2	Support: Extract fs::resize_file_before_mapping_readwrite from FileOutputBuffer Add a variant of `fs::resize_file` for use immediately before opening a file with `mapped_file_region::readwrite`. On Windows, `_chsize` (`ftruncate`) is slow, but `CreateFileMapping` (`mmap`) automatically extends the file so the call to `fs::resize_file` can be skipped. This optimization was added to `FileOutputBuffer` in da9bc2e56d5a5c6332a9def1a0065eb399182b93; this commit just extracts the logic out and adds a unit test. Differential Revision: https://reviews.llvm.org/D95490	2021-04-08 16:26:35 -07:00
Arthur Eubanks	c5d1ccbcdf	[GVN] Properly invalidate ICF cache when we simplify a value This fixes a "Cached first special instruction is wrong!" assert. The assert fires because replacing a value with another can cause an instruction to no longer be "special" to ICF. In this case, devirtualization happened, turning an indirect call to a call to a willreturn function which is no longer special. Reviewed By: nikic, rnk Differential Revision: https://reviews.llvm.org/D99977	2021-04-08 14:01:57 -07:00
Paul C. Anagnostopoulos	3f919ff250	Revert "[TableGen] Add support for the 'assert' statement in multiclasses" This reverts commit `3b9a15d910`.	2021-04-08 13:58:58 -04:00
Paul C. Anagnostopoulos	3b9a15d910	[TableGen] Add support for the 'assert' statement in multiclasses	2021-04-08 08:36:03 -04:00
Stephen Tozer	140757bfaa	[DebugInfo] Prevent invalid debug info being produced during LoopStrengthReduce During LoopStrengthReduce, some of the SSA values that are used by debug values may be lost and/or salvaged. After LSR we attempt to recover any undef debug values, including any that were salvaged but then lost their values afterwards, by replacing the lost values with any live equal values (plus a possible constant offset) that have been gathered prior to running LSR. When we do this we restore the debug value's original DIExpression, to undo any salvaging (as we have gone back to using the original debug value). This process can currently produce invalid debug info if the number of operands has changed by salvaging during LSR. Replacing old values during the applyEqualValues step does not change the number of location operands, which means that when we restore the old DIExpression we may have a mismatch between the number of operands used by the debug value and the number of operands referenced by the DIExpression. This patch fixes this by restoring the full original location metadata at the start of the applyEqualValues step, so that there is no mismatch in operand count between the debug value and its DIExpression. Differential Revision: https://reviews.llvm.org/D98644	2021-04-08 13:04:48 +01:00
Arthur Eubanks	90af134473	Revert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()" This reverts commit `9583a3f262`. This wasn't NFC as initially thought. Needed for D99707.	2021-04-07 11:40:44 -07:00
LemonBoy	03f7b13d44	[X86] Initialize TargetOptions::StackProtectorGuardOffset member to its default value D88631 introduced a set of knobs to tweak how the stack protector is codegen'd for x86 targets, including the offset from the base register where the stack cookie is located. The `StackProtectorGuardOffset` field in `TargetOptions` was left uninitialized instead of being reset to its neutral value -1, making it possible to emit nonsensical code if the frontend doesn't change the field value at all before feeding the `TargetOptions` to the target machine initializer. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D99952	2021-04-07 09:04:25 +02:00
Jonas Devlieghere	5d07dc8977	[dsymutil] Don't emit .debug_pubnames and .debug_pubtypes Consider the .debug_pubnames and .debug_pubtypes their own kind of accelerator and stop emitting them together with the Apple-style accelerator tables. The only reason we were still emitting both was for (byte-for-byte) compatibility with dsymutil-classic. - This patch adds a new accelerator table kind "Pub" which can be specified with --accelerator=Pub. - This patch removes the ability to emit both pubnames/types and apple style accelerator tables. I don't think anyone is relying on that but it's worth pointing out. - This patch removes the --minimize option and makes this behavior the default. Specifying the flag will result in a warning but won't abort the program. Differential revision: https://reviews.llvm.org/D99907	2021-04-06 19:01:45 -07:00
Nicolás Alvarez	a1aada75f5	[docs] Fix doxygen comments wrongly attached to the llvm namespace Looking at the Doxygen-generated documentation for the llvm namespace currently shows all sorts of random comments from different parts of the codebase. These are mostly caused by: - File doc comments that aren't marked with \file, so they're attached to the next declaration, which is usually "namespace llvm {". - Class doc comments placed before the namespace rather than before the class. - Code comments before the namespace that (in my opinion) shouldn't be extracted by doxygen at all. This commit fixes these comments. The generated doxygen documentation now has proper docs for several classes and files, and the docs for the llvm and llvm::detail namespaces are now empty. Reviewed By: thakis, mizvekov Differential Revision: https://reviews.llvm.org/D96736	2021-04-07 01:20:18 +02:00
Craig Topper	0d6dd23ca9	[MachineValueTypes] Add blank lines between floating point vectors with different element types. NFC	2021-04-06 14:51:56 -07:00
Sidharth Baveja	d81d9e8b86	[SplitEdge] Update SplitCriticalEdge to return a nullptr only when the edge is not critical Summary: The function SplitCriticalEdge (called by SplitEdge) can return a nullptr in cases where the edge is a critical. SplitEdge uses SplitCriticalEdge assuming it can always split all critical edges, which is an incorrect assumption. The three cases where the function SplitCriticalEdge will return a nullptr is: 1. DestBB is an exception block 2. Options.IgnoreUnreachableDests is set to true and isa(DestBB->getFirstNonPHIOrDbgOrLifetime()) is not equal to a nullptr 3. LoopSimplify form must be preserved (Options.PreserveLoopSimplify is true) and it cannot be maintained for a loop due to indirect branches For each of these situations they are handled in the following way: 1. Modified the function ehAwareSplitEdge originally from llvm/lib/Transforms/Coroutines/CoroFrame.cpp to handle the cases when the DestBB is an exception block. This function is called directly in SplitEdge. SplitEdge does not call SplitCriticalEdge in this case 2. Options.IgnoreUnreachableDests is set to false by default, so this situation does not apply. 3. Return a nullptr in this situation since the SplitCriticalEdge also returned nullptr. Nothing we can do in this case. Reviewed By: asbirlea Differential Revision:https://reviews.llvm.org/D94619	2021-04-06 21:24:40 +00:00
Philip Reames	9ef6aa020b	Plumb AssumeInst through operand bundle apis [nfc] Follow up to `a6d2a8d6f5`. This covers all the public interfaces of the bundle related code. I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.	2021-04-06 12:53:53 -07:00
Philip Reames	a6d2a8d6f5	Add a subclass of IntrinsicInst for llvm.assume [nfc] Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class. A follow up change will do the same for the newer assumption query/bundle mechanisms.	2021-04-06 11:16:22 -07:00
Philip Reames	21d4839948	Move GCRelocateInst and GCResultInst to IntrinsicInst.h [nfc] These two are part of the IntrinsicInst class hierarchy and it helps to cut down on some redundant includes.	2021-04-06 08:33:15 -07:00
Kerry McLaughlin	7344f3d39a	[LoopVectorize] Add strict in-order reduction support for fixed-width vectorization Previously we could only vectorize FP reductions if fast math was enabled, as this allows us to reorder FP operations. However, it may still be beneficial to vectorize the loop by moving the reduction inside the vectorized loop and making sure that the scalar reduction value be an input to the horizontal reduction, e.g: %phi = phi float [ 0.0, %entry ], [ %reduction, %vector_body ] %load = load <8 x float> %reduction = call float @llvm.vector.reduce.fadd.v8f32(float %phi, <8 x float> %load) This patch adds a new flag (IsOrdered) to RecurrenceDescriptor and makes use of the changes added by D75069 as much as possible, which already teaches the vectorizer about in-loop reductions. For now in-order reduction support is off by default and controlled with the `-enable-strict-reductions` flag. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D98435	2021-04-06 14:45:34 +01:00
Abhina Sreeskantharajan	82b3e28e83	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text Problem: On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable. Solution: This patch adds two new flags - OF_CRLF which indicates that CRLF translation is used. - OF_TextWithCRLF = OF_Text \| OF_CRLF indicates that the file is text and uses CRLF translation. Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF. So this is the behaviour per platform with my patch: z/OS: OF_None: open in binary mode OF_Text : open in text mode OF_TextWithCRLF: open in text mode Windows: OF_None: open file with no carriage return OF_Text: open file with no carriage return OF_TextWithCRLF: open file with carriage return The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set. ``` if (Flags & OF_CRLF) CrtOpenFlags \|= _O_TEXT; ``` These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows. ./llvm/lib/Support/raw_ostream.cpp ./llvm/lib/TableGen/Main.cpp ./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp ./llvm/unittests/Support/Path.cpp ./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp ./clang/lib/Frontend/CompilerInstance.cpp ./clang/lib/Driver/Driver.cpp ./clang/lib/Driver/ToolChains/Clang.cpp Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D99426	2021-04-06 07:23:31 -04:00
Kerry McLaughlin	857b8a73da	[LoopVectorize] Change the identity element for FAdd Changes getRecurrenceIdentity to always return a neutral value of -0.0 for FAdd. Reviewed By: dmgreen, spatel Differential Revision: https://reviews.llvm.org/D98963	2021-04-06 12:13:43 +01:00
Simon Pilgrim	ddbb58736a	[KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI. As promised in D98866	2021-04-06 10:11:41 +01:00
Yevgeny Rouban	39e3e3aa51	[NewPM] Redesign of PreserveCFG Checker The reason for the NewPM redesign is described in the commit cba3e783389a: [NewPM] Disable PreservedCFGChecker ... The checker introduces an internal custom CFG analysis that tracks current up-to date CFG snapshot. The analysis is invalidated along any other CFG related analysis (the key is CFGAnalyses). If the CFG analysis is not invalidated at a functional pass exit then the checker asserts that the CFG snapshot taken from this analysis is equals to a snapshot of the current CFG. Along the way: - the function CFG::printDiff() is simplified by removing function name calculation. The name is printed by the caller; - fixed CFG invalidated condition (see CFG::invalidate()); - StandardInstrumentations::registerCallbacks() gets additional optional parameter of type FunctionAnalysisManager*, which is needed by the checker to get the custom CFG analysis; - several PM related tests updated to explicitly set -verify-cfg-preserved=1 as they need. This patch is safe to land as the CFGChecker is left switched off (the options -verify-cfg-preserved is false by default). It will be switched on by a separate patch to minimize possible reverts. Reviewed By: skatkov, kuhar Differential Revision: https://reviews.llvm.org/D91327	2021-04-06 12:35:49 +07:00
Serguei Katkov	0057ec8034	[Statepoint] Factor-out utility function to get non-foldable area of STATEPOINT like instructions. NFC Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D99875	2021-04-06 11:44:37 +07:00
Arthur Eubanks	ea0e2ca1ac	[SROA] Allow SROA on pointers with invariant group intrinsic uses When we are able to SROA an alloca, we know all uses of it, meaning we don't have to preserve the invariant group intrinsics and metadata. It's possible that we could lose information regarding redundant loads/stores, but that's unlikely to have any real impact since right now the only user is Clang and vtables. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99760	2021-04-05 19:53:40 -07:00
Stanislav Mekhanoshin	30b3aab329	Copy syncscope when expanding atomicrmw into cmpxchg loop Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902	2021-04-05 17:29:38 -07:00
Ricky Taylor	4db18d62af	[M68k] Add support for Motorola literal syntax to AsmParser These look like $00A0cf for hex and %001010101 for binary. They are used in Motorola assembly syntax. Differential Revision: https://reviews.llvm.org/D98519	2021-04-05 20:02:29 +01:00
Jennifer Yu	7078ef4722	[OPENMP51]Initial support for nocontext clause. Added basic parsing/sema/serialization support for the 'nocontext' clause. Differential Revision: https://reviews.llvm.org/D99848	2021-04-05 11:45:49 -07:00
Cyndy Ishida	0116d04d04	[TextAPI] move source code files out of subdirectory, NFC TextAPI/ELF has moved out into InterfaceStubs, so theres no longer a need to seperate out TextAPI between formats. Reviewed By: ributzka, int3, #lld-macho Differential Revision: https://reviews.llvm.org/D99811	2021-04-05 10:24:42 -07:00
Alexey Bataev	00a84f9a7f	[SLP]Improve vectorization of the CmpInst instructions. During vectorization better to postpone the vectorization of the CmpInst instructions till the end of the basic block. Otherwise we may vectorize it too early and may miss some vectorization patterns, like reductions. Reworked part of D57059 Differential Revision: https://reviews.llvm.org/D99796	2021-04-05 06:22:51 -07:00
Alex Orlov	5f57793c4f	* NFC. Refactored DIPrinter for better support of new print styles. This patch introduces a DIPrinter interface to implement by different output style printer implementations. DIPrinterGNU and DIPrinterLLVM implement the GNU and LLVM output style printing respectively. No functional changes. This refactoring clarifies and simplifies the code, and makes a new output style addition easier. Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D98994	2021-04-05 15:40:41 +04:00
Nikita Popov	665065821e	[FastISel] Remove kill tracking This is a followup to D98145: As far as I know, tracking of kill flags in FastISel is just a compile-time optimization. However, I'm not actually seeing any compile-time regression when removing the tracking. This probably used to be more important in the past, before FastRA was switched to allocate instructions in reverse order, which means that it discovers kills as a matter of course. As such, the kill tracking doesn't really seem to serve a purpose anymore, and just adds additional complexity and potential for errors. This patch removes it entirely. The primary changes are dropping the hasTrivialKill() method and removing the kill arguments from the emitFast methods. The rest is mechanical fixup. Differential Revision: https://reviews.llvm.org/D98294	2021-04-03 15:50:13 +02:00
Nikita Popov	9d20eaf9c0	[BasicAA] Don't store AATags in cache key (NFC) The AAMDNodes part of the MemoryLocation is not used by the BasicAA cache, so don't store it. This reduces the size of each cache entry from 112 bytes to 48 bytes.	2021-04-03 11:32:01 +02:00
Nikita Popov	17b4e5d456	[BasicAA] Don't pass through AA metadata (NFCI) BasicAA itself doesn't make use of AA metadata, but passes it through to recursive queries and makes it part of the cache key. Aliasing decisions that are based on AA metadata (i.e. TBAA and ScopedAA) are based only on AA metadata, so checking them with different pointer values or sizes is not useful, the result will always be the same. While this change is a mild compile-time improvement by itself, the actual goal here is to reduce the size of AA cache keys in a followup change. Differential Revision: https://reviews.llvm.org/D90098	2021-04-03 11:21:50 +02:00
Simon Pilgrim	4ea5475a3f	[KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI. Include exhaustive test coverage.	2021-04-02 21:44:33 +01:00
Jennifer Yu	cb424fee3d	[OPENMP5.1]Initial support for novariants clause. Added basic parsing/sema/serialization support for the 'novariants' clause.	2021-04-02 13:19:01 -07:00
Jan Kratochvil	942cf22565	[nfc] [llvm] Make DWARFListTableBase::findList const Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99731	2021-04-02 21:41:47 +02:00
Levy Hsu	f78d932cf2	[RISCV] Add IR intrinsics for Zbc extension Head files are included in a separate patch in case the name needs to be changed. RV32 / 64: clmul clmulh clmulr Differential Revision: https://reviews.llvm.org/D99711	2021-04-02 12:09:13 -07:00
Levy Hsu	944adbf285	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	1f0b309f24	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit `1808194590`. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Cyndy Ishida	3a223cd4f3	[TextAPI] run clang-format on violating sections, NFC	2021-04-02 11:44:33 -07:00
Craig Topper	1808194590	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Levy Hsu	b001d574d7	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Brendon Cahoon	09a88278cb	[GlobalISel] Allow different types for G_SBFX and G_UBFX operands Change the definition of G_SBFX and G_UBFX so that the lsb and width can have different types than the src and dst operands. Differential Revision: https://reviews.llvm.org/D99739	2021-04-02 11:11:06 -04:00
Sander de Smalen	0f7bbbc481	Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed. In order to bring up scalable vector support in LLVM incrementally, we introduced behaviour to emit a warning, instead of an error, when asking the wrong question of a scalable vector, like asking for the fixed number of elements. This patch puts that behaviour under a flag. The default behaviour is that the compiler will always error, which means that all LLVM unit tests and regression tests will now fail when a code-path is taken that still uses the wrong interface. The behaviour to demote an error to a warning can be individually enabled for tools that want to support experimental use of scalable vectors. This patch enables that behaviour when driving compilation from Clang. This means that for users who want to try out scalable-vector support, fixed-width codegen support, or build user-code with scalable vector intrinsics, Clang will not crash and burn when the compiler encounters such a case. This allows us to do away with the following pattern in many of the SVE tests: RUN: .... 2>%t RUN: cat %t \| FileCheck --check-prefix=WARN WARN-NOT: warning: ... The behaviour to emit warnings is only temporary and we expect this flag to be removed in the future when scalable vector support is more stable. This patch also has fixes the following tests: unittests: ScalableVectorMVTsTest.SizeQueries SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG regression tests: Transforms/InstCombine/vscale_gep.ll Reviewed By: paulwalker-arm, ctetreau Differential Revision: https://reviews.llvm.org/D98856	2021-04-02 10:55:22 +01:00
Evgeniy Brevnov	2388aae401	[NARY-REASSOCIATE] Support reassociation of min/max Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available. Reviewed By: mkazantsev, lebedev.ri Differential Revision: https://reviews.llvm.org/D88287	2021-04-02 15:30:13 +07:00
Daniel Rodríguez Troitiño	f5c9db97a8	[TextAPI] Add support for arm64_32 Add a new architecture definition for arm64_32. The change should allow the new architecture arm64_32 to be recognized in several pieces of code, TextAPI parsing one of them. llvm-lipo will also recognize the architecture and will allow lipoing files with this architecture without failing. Includes a small test that the architecture is recognized by llvm-nm. Reviewed By: cishida Differential Revision: https://reviews.llvm.org/D99673	2021-04-01 17:19:12 -07:00
cchen	cba422264c	[OpenMP51] Accept `primary` as proc bind affinity policy in Clang Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99622	2021-04-01 18:07:12 -05:00
Philip Reames	ffa15e9463	Extract isVolatile helper on Instruction [NFCI] We have this logic duplicated in several cases, none of which were exhaustive. Consolidate it in one place. I don't believe this actually impacts behavior of the callers. I think they all filter their inputs such that their partial implementations were correct. If not, this might be fixing a cornercase bug.	2021-04-01 11:24:02 -07:00
Philip Reames	6b05d753e0	Mark unordered memset/memmove/memcpy as nosync Mostly a means to remove a bit of code from attributor in advance of implementing a FuncAttr inference for nosync.	2021-04-01 10:38:54 -07:00
Philip Reames	e2c6621e63	[deref-at-point] restrict inference of dereferenceability based on allocsize attribute Support deriving dereferenceability facts from allocation sites with known object sizes while correctly accounting for any possibly frees between allocation and use site. (At the moment, we're conservative and only allowing it in functions where we know we can't free.) This is part of the work on deref-at-point semantics. I'm making the change unconditional as the miscompile in this case is way too easy to trip by accident, and the optimization was only recently added (by me). There will be a follow up patch wiring through TLI since that should now be doable without introducing widespread miscompiles. Differential Revision: https://reviews.llvm.org/D95815	2021-04-01 08:34:40 -07:00
Mircea Trofin	ce61def529	[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs() needs to be called before iterating the interfering live ranges. The rest of the patch offers support that is the case: instead of clearing the query's InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix is invalidated (existing triggering mechanism). Without the change in RegAllocGreedy.cpp, the compiler ices. This patch should make it more easily discoverable by developers that collectInterferringVregs needs to be called before iterating. I will follow up with a subsequent patch to improve the usability and maintainability of Query. Differential Revision: https://reviews.llvm.org/D98232	2021-04-01 08:33:28 -07:00
Anirudh Prasad	7b921a6747	[AsmParser][SystemZ][z/OS] Add in support to accept "#" as part of an Identifier token - This patch adds in support to accept the "#" character as part of an Identifier. - This support is needed especially for the HLASM dialect since "#" is treated as part of the valid "Alphabet" range - The way this is done is by making use of the previous precedent set by the `AllowAtInIdentifier` field in `MCAsmLexer.h`. A new field called `AllowHashInIdentifier` is introduced. - The static function `IsIdentifierChar` is also updated to accept the `#` character if the `AllowHashInIdentifier` field is set to true. Note: The field introduced in `MCAsmLexer.h` could very well be moved to `MCAsmInfo.h`. I'm not opposed to it. I decided to put it in `MCAsmLexer` since there seems to be some sort of precedent already with `AllowAtInIdentifier`. Reviewed By: abhina.sreeskantharajan, nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D99277	2021-04-01 11:24:43 -04:00
Max Kazantsev	630818a850	[NFC] Disambiguate LI in GVN Name GVN uses name 'LI' for two different unrelated things: LoadInst and LoopInfo. This patch relates the variables with former meaning into 'Load' to disambiguate the code.	2021-04-01 12:40:35 +07:00
Chen Zheng	bfcd21876a	[debug-info] support new tuning debugger type DBX for XCOFF DWARF Based on this debugger type, for now, we plan to: 1: use inline string by default for XCOFF DWARF 2: generate no column info for debug line table. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99400	2021-04-01 00:11:30 -04:00
Philip Reames	115a42ad1e	Add debug printers for KnownBits [nfc]	2021-03-31 15:36:07 -07:00
Thomas Lively	45783d0e8a	[WebAssembly] Implement i64x2 comparisons Removes the prototype builtin and intrinsic for i64x2.eq and implements that instruction as well as the other i64x2 comparison instructions in the final SIMD spec. Unsigned comparisons were not included in the final spec, so they still need to be scalarized via a custom lowering. Differential Revision: https://reviews.llvm.org/D99623	2021-03-31 10:46:17 -07:00
Wael Yehia	563cdeaafd	[LTO][Legacy] Decouple option parsing from LTOCodeGenerator in this patch we add a new libLTO API to specify debug options independent of an lto_code_gen_t. This allows clients to pass codegen flags (through libLTO) which otherwise today are ignored. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D92611	2021-03-31 16:43:26 +00:00
Sander de Smalen	b6d0529780	[CostModel] Align the cost model for intrinsics for scalable/fixed-width vectors. Let getIntrinsicInstrCost call getTypeBasedIntrinsicInstrCost for scalable vectors, similar to how this is done for fixed-width vectors, instead of falling back on BaseT::getIntrinsicInstrCost(). If the intrinsic cannot be costed (or is not overloaded by the target), it will return InstructionCost::getInvalid() instead. Depends on D97469 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D97470	2021-03-31 14:52:49 +01:00
Sander de Smalen	2f6f249a49	NFC: Change getIntrinsicInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97468 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D97469	2021-03-31 14:04:41 +01:00
Sander de Smalen	2f56e1c6b1	NFC: Change getTypeBasedIntrinsicCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97466 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D97468	2021-03-31 14:04:41 +01:00
Sander de Smalen	3ccbd4f3c7	NFC: Change getUserCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97382 Reviewed By: ctetreau, paulwalker-arm Differential Revision: https://reviews.llvm.org/D97466	2021-03-31 10:13:09 +01:00
Lang Hames	ec235dd355	[JITLink] Delete copy and move constructors for jitlink::Section. Sections are not movable or copyable.	2021-03-30 22:58:14 -07:00
Lang Hames	0269a407f3	[JITLink] Switch from StringRef to ArrayRef<char>, add some generic x86-64 utils Adds utilities for creating anonymous pointers and jump stubs to x86_64.h. These are used by the GOT and Stubs builder, but may also be used by pass writers who want to create pointer stubs for indirection. This patch also switches the underlying type for LinkGraph content from StringRef to ArrayRef<char>. This avoids any confusion when working with buffers that contain null bytes in the middle like, for example, a newly added null pointer content array. ;)	2021-03-30 21:07:24 -07:00
Chuanqi Xu	eb51dd719f	[Coroutine] [Debug] Insert dbg.declare to entry.resume to print alloca in the coroutine frame under O2 Summary: Try to insert dbg.declare to entry.resume basic block in resume function. In this way, we could print alloca such as __promise in gdb/lldb under O2, which would be beneficial to debug coroutine program. Test Plan: check-llvm Reviewed by: aprantl Differential Revision: https://reviews.llvm.org/D96938	2021-03-31 10:37:06 +08:00
Lang Hames	3a83b8b2d2	[JITLink] Add a setProtectionFlags method to jitlink::Section. This allows clients to modify the memory protection settings on sections via jitlink passes. This can be used to, for example, override the default settings on text pages and make them Read/Write/Executable under the JIT.	2021-03-30 17:56:29 -07:00
Craig Topper	f59ba0849f	[StructLayout] Use TrailingObjects to allocate space for MemberOffsets. MemberOffsets are stored at the end of StructLayout. The class contains a single entry array to mark the start of the member offsets. getStructLayout calculates the additional space needed for additional elements before allocating memory. This patch converts this to use TrailingObjects. This simplifies the size computation in getStructLayout and gets rid of the single entry array. This is prep work, but to use TypeSize instead of uint64_t for D98169. The single entry array doesn't work with TypeSize because TypeSize doesn't have a default constructor. We thought this change was an improvement by itself so we've separated it out. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D99608	2021-03-30 17:36:50 -07:00
Wei Mi	d535a05ca1	[ThinLTO] During module importing, close one source module before open another one for distributed mode. Currently during module importing, ThinLTO opens all the source modules, collect functions to be imported and append them to the destination module, then leave all the modules open through out the lto backend pipeline. This patch refactors it in the way that one source module will be closed before another source module is opened. All the source modules will be closed after importing phase is done. It will save some amount of memory when there are many source modules to be imported. Note that this patch only changes the distributed thinlto mode. For in process thinlto mode, one source module is shared acorss different thinlto backend threads so it is not changed in this patch. Differential Revision: https://reviews.llvm.org/D99554	2021-03-30 14:37:29 -07:00
Mike Rice	b7899ba0e8	[OPENMP51]Initial support for the dispatch directive. Added basic parsing/sema/serialization support for dispatch directive. Differential Revision: https://reviews.llvm.org/D99537	2021-03-30 14:12:53 -07:00
Amara Emerson	a35c2c7942	[GlobalISel] Implement fewerElements legalization for vector reductions. This patch adds 3 methods, one for power-of-2 vectors which use tree reductions using vector ops, before a final reduction op. For non-pow-2 types it generates multiple narrow reductions and combines the values with scalar ops. Differential Revision: https://reviews.llvm.org/D97163	2021-03-30 11:19:21 -07:00
Amara Emerson	91887cd4ec	[AArch64][GlobalISel] Combine funnel shifts to rotates. Differential Revision: https://reviews.llvm.org/D99388	2021-03-30 11:00:36 -07:00
Sourabh Singh Tomar	f13f050551	[DebugInfo] Support for signed constants inside DIExpression Negative numbers are represented using DW_OP_consts along with signed representation of the number as the argument. Test case IR is generated using Fortran front-end. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99273	2021-03-30 23:20:38 +05:30
Hongtao Yu	3e3fc431df	[CSSPGO] Top-down processing order based on full profile. Use profiled call edges to augment the top-down order. There are cases that the top-down order computed based on the static call graph doesn't reflect real execution order. For example: 1. Incomplete static call graph due to unknown indirect call targets. Adjusting the order by considering indirect call edges from the profile can enable the inlining of indirect call targets by allowing the caller processed before them. 2. Mutual call edges in an SCC. The static processing order computed for an SCC may not reflect the call contexts in the context-sensitive profile, thus may cause potential inlining to be overlooked. The function order in one SCC is being adjusted to a top-down order based on the profile to favor more inlining. 3. Transitive indirect call edges due to inlining. When a callee function is inlined into into a caller function in LTO prelink, every call edge originated from the callee will be transferred to the caller. If any of the transferred edges is indirect, the original profiled indirect edge, even if considered, would not enforce a top-down order from the caller to the potential indirect call target in LTO postlink since the inlined callee is gone from the static call graph. 4. #3 can happen even for direct call targets, due to functions defined in header files. Header functions, when included into source files, are defined multiple times but only one definition survives due to ODR. Therefore, the LTO prelink inlining done on those dropped definitions can be useless based on a local file scope. More importantly, the inlinee, once fully inlined to a to-be-dropped inliner, will have no profile to consume when its outlined version is compiled. This can lead to a profile-less prelink compilation for the outlined version of the inlinee function which may be called from external modules. while this isn't easy to fix, we rely on the postlink AutoFDO pipeline to optimize the inlinee. Since the survived copy of the inliner (defined in headers) can be inlined in its local scope in prelink, it may not exist in the merged IR in postlink, and we'll need the profiled call edges to enforce a top-down order for the rest of the functions. Considering those cases, a profiled call graph completely independent of the static call graph is constructed based on profile data, where function objects are not even needed to handle case #3 and case 4. I'm seeing an average 0.4% perf win out of SPEC2017. For certain benchmark such as Xalanbmk and GCC, the win is bigger, above 2%. The change is an enhancement to https://reviews.llvm.org/D95988. Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D99351	2021-03-30 10:42:22 -07:00
Jessica Paquette	700431128e	[GlobalISel][AArch64] Combine G_SEXT_INREG + right shift -> G_SBFX Basically a port of isBitfieldExtractOpFromSExtInReg in AArch64ISelDAGToDAG. This is only done post-legalization for now. Once the legalizer knows how to decompose these back into shifts, this requirement can probably be removed. Differential Revision: https://reviews.llvm.org/D99230	2021-03-30 10:14:30 -07:00
Amara Emerson	f5e9be6fdb	[GlobalISel] Implement lowering for G_ROTR and G_ROTL. This is a straightforward port. Differential Revision: https://reviews.llvm.org/D99449	2021-03-30 09:44:41 -07:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to it's name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Craig Topper	292816d2b6	[RISCV] Don't set the SplatOperand flag on intrinsics that take a shift amount. The shift amount should always be a vector or an XLen scalar. The SplatOperand flag is used to indicate we need to legalize non-XLen scalars including special handling for i64 on RV32. This will prevent us from silently adjusting these operands if the intrinsics are misused. I'll probably adjust the name of the SplatOperand flag slightly in a follow up patch. Reviewed By: khchen, frasercrmck Differential Revision: https://reviews.llvm.org/D99545	2021-03-30 09:23:36 -07:00
Krasimir Georgiev	c51e91e046	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `5178ffc7cf`. Compiling `llvm-profdata` with a compiler build from this produces a crashing binary.	2021-03-30 14:13:37 +02:00
Sander de Smalen	f71ed5dfe2	NFC: Migrate PartialInlining to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D97382	2021-03-30 11:59:45 +01:00
Sander de Smalen	4ca860742d	[InstructionCost] Don't conflate Invalid costs with Unknown costs. We previously made a change to getUserCost to return a Invalid cost when one of the TTI costs returned '-1' (meaning 'unknown' or 'infinitely expensive'). It makes no sense to say that: shufflevector <2 x i8> %x, <2 x i8> %y, <4 x i32> <i32 0, i32 1, i32 2, i32 3> has an invalid cost. Perhaps the cost is not known, but the IR is valid and can be code-generated. Invalid should only be used for IR that cannot possibly be code-generated and where a cost is nonsensical. With more passes now asserting that the cost must be valid, it is possible that those assertions will fail for perfectly valid IR. An incomplete cost-model probably shouldn't be a reason for the compiler to break. It's better to consider these costs as 'very expensive' and ignore them for other reasons. At some point, we should consider replacing -1 with some other mechanism. Reviewed By: paulwalker-arm, dmgreen Differential Revision: https://reviews.llvm.org/D99502	2021-03-30 09:29:42 +01:00
Stefan Gränitz	c352a2b829	[lli] Add option -lljit-platform=Inactive to disable platform support explicitly This option tells LLJIT to disable platform support explicitly: JITDylibs aren't scanned for special init/deinit symbols and no runtime API interposes are injected. It's useful in two cases: for platforms that don't have such requirements and platforms for which we have no explicit support yet and that don't work well with the generic IR platform. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D99416	2021-03-30 09:29:45 +02:00
Alok Kumar Sharma	9fb0025f70	[DebugInfo] Upgrade DISubragne::count to accept DIExpression also This is needed for Fortran assumed shape arrays whose dimensions are defined as, - 'count' is taken from array descriptor passed as parameter by caller, access from descriptor is defined by type DIExpression. - 'lowerBound' is defined by callee. The current alternate way represents using upperBound in place of count, where upperBound is calculated in callee in a temp variable using lowerBound and count Representation with count (DIExpression) is not only clearer as compared to upperBound (DIVariable) but it has another advantage that variable count is accessed by being parameter has better chance of survival at higher optimization level than upperBound being local variable. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99335	2021-03-30 09:16:55 +05:30
Adrian Prantl	8573c28a51	Add debug support for set types This commit adds debugging support for set types defined in languages such as Pascal and Modula-2. Patch by Peter McKinna! Differential Revision: https://reviews.llvm.org/D76115	2021-03-29 18:04:48 -07:00
Huihui Zhang	ca721042f1	[IPO][SampleContextTracker] Use SmallVector to track context profiles to prevent non-determinism. Use SmallVector instead of SmallSet to track the context profiles mapped. Doing this can help avoid non-determinism caused by iterating over unordered containers. This bug was found with reverse iteration turning on, --extra-llvm-cmake-variables="-DLLVM_REVERSE_ITERATION=ON". Failing LLVM test profile-context-tracker-debug.ll . Reviewed By: MaskRay, wenlei Differential Revision: https://reviews.llvm.org/D99547	2021-03-29 16:37:10 -07:00
Gulfem Savrun Yeniceri	5178ffc7cf	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-29 21:53:32 +00:00
Nico Weber	f53dc06ed3	fix comment typo to cycle bots	2021-03-29 15:47:16 -04:00
Wenlei He	30b0232336	[CSSPGO][llvm-profgen] Context-sensitive global pre-inliner This change sets up a framework in llvm-profgen to estimate inline decision and adjust context-sensitive profile based on that. We call it a global pre-inliner in llvm-profgen. It will serve two purposes: 1) Since context profile for not inlined context will be merged into base profile, if we estimate a context will not be inlined, we can merge the context profile in the output to save profile size. 2) For thinLTO, when a context involving functions from different modules is not inined, we can't merge functions profiles across modules, leading to suboptimal post-inline count quality. By estimating some inline decisions, we would be able to adjust/merge context profiles beforehand as a mitigation. Compiler inline heuristic uses inline cost which is not available in llvm-profgen. But since inline cost is closely related to size, we could get an estimate through function size from debug info. Because the size we have in llvm-profgen is the final size, it could also be more accurate than the inline cost estimation in the compiler. This change only has the framework, with a few TODOs left for follow up patches for a complete implementation: 1) We need to retrieve size for funciton//inlinee from debug info for inlining estimation. Currently we use number of samples in a profile as place holder for size estimation. 2) Currently the thresholds are using the values used by sample loader inliner. But they need to be tuned since the size here is fully optimized machine code size, instead of inline cost based on not yet fully optimized IR. Differential Revision: https://reviews.llvm.org/D99146	2021-03-29 09:46:14 -07:00
Wei Mi	3cbf44190b	[SampleFDO] Do not scale the magic number NOMORE_ICP_MAGICNUM in value profile during profile update. When we inline a function and update the profile, the value profiles of the indirect call in the inliner and inlinee will be scaled. In https://reviews.llvm.org/D96806 and https://reviews.llvm.org/D97350, we start using the magic number NOMORE_ICP_MAGICNUM (-1) to mark targets which have been promoted. The magic number shouldn't be scaled during the profile update. Although the problem has been suppressed by https://reviews.llvm.org/D98187 for SampleFDO, which stops profile update for inlining in sampleFDO, the patch is still wanted since it will be more consistent to handle the magic number properly in profile update. Differential Revision: https://reviews.llvm.org/D99394	2021-03-29 09:34:37 -07:00
Florian Hahn	c773d0f973	Recommit "[LV] Move runtime pointer size check to LVP::plan()." Re-apply `25fbe803d4`, with a small update to emit the right remark class. Original message: [LV] Move runtime pointer size check to LVP::plan(). This removes the need for the remaining doesNotMeet check and instead directly checks if there are too many runtime checks for vectorization in the planner. A subsequent patch will adjust the logic used to decide whether to vectorize with runtime to consider their cost more accurately. Reviewed By: lebedev.ri	2021-03-29 16:14:27 +01:00
Bradley Smith	9745dce8c3	[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps This is currently performed in SelectionDAGLegalize, here we make it also happen in LegalizeVectorOps, allowing a target to lower the SETCC condition codes first in LegalizeVectorOps and then lower to a custom node afterwards, without having to duplicate all of the SETCC condition legalization in the target specific lowering. As a result of this, fixed length floating point SETCC nodes can now be properly lowered for SVE. Differential Revision: https://reviews.llvm.org/D98939	2021-03-29 15:32:25 +01:00
Florian Hahn	485c8ce733	Revert "[LV] Move runtime pointer size check to LVP::plan()." This reverts commit `25fbe803d4`. This breaks a clang test which filters for the wrong remark type.	2021-03-29 14:41:53 +01:00
Paul C. Anagnostopoulos	5f473a04af	[TableGen] Add support for the 'assert' statement in class definitions. Differential Revision: https://reviews.llvm.org/D99275	2021-03-29 09:20:29 -04:00
Florian Hahn	25fbe803d4	[LV] Move runtime pointer size check to LVP::plan(). This removes the need for the remaining doesNotMeet check and instead directly checks if there are too many runtime checks for vectorization in the planner. A subsequent patch will adjust the logic used to decide whether to vectorize with runtime to consider their cost more accurately. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98634	2021-03-29 14:12:29 +01:00
Matt Arsenault	9a0c9402fa	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `07e46367ba`.	2021-03-29 08:55:30 -04:00
Jingu Kang	e4abb64100	[LoopUnswitch] Use reference variables instead of pointer one Differential Revision: https://reviews.llvm.org/D99496	2021-03-29 13:08:46 +01:00
Oliver Stannard	07e46367ba	Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute"" Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit `fc9df30991`.	2021-03-29 11:32:22 +01:00
Jingu Kang	cfe87d4edd	[NFC][LoopUnswitch] Move hasPartialIVCondition to LoopUtils Differential revision: https://reviews.llvm.org/D99490	2021-03-29 10:29:45 +01:00
Lang Hames	666df2e2cb	[ORC][C-bindings] Fix some ORC C bindings function names and signatures. LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not have matching signatures between the C-API header and binding implementations. Fixes http://llvm.org/PR49745. Patch by Mats Larsen. Thanks Mats! Reviewed by: lhames Differential Revision: https://reviews.llvm.org/D99478	2021-03-28 16:30:47 -07:00
Matt Arsenault	fc9df30991	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `20d5c42e0e`.	2021-03-28 13:35:21 -04:00
Nico Weber	20d5c42e0e	Revert "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `4fefed6563`. Broke check-clang everywhere.	2021-03-28 13:02:52 -04:00
Matt Arsenault	4fefed6563	OpaquePtr: Turn inalloca into a type attribute I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.	2021-03-28 11:12:23 -04:00
Florian Hahn	eb3d9f2eb6	[SelDag] Add isIntOrFPConstant helper function. This patch adds a new isIntOrFPConstant helper function to check if a SDValue is a integer of FP constant. This pattern is used in various places. There also are places that incorrectly just check for integer constants, e.g. D99384, so hopefully this helper will help people avoid that issue. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D99428	2021-03-28 12:48:58 +01:00
Nikita Popov	9075864b73	[BasicAA] Refactor linear expression decomposition The current linear expression decomposition handles zext/sext by decomposing the casted operand, and then checking NUW/NSW flags to determine whether the extension can be distributed. This has some disadvantages: First, it is not possible to perform a partial decomposition. If we have zext((x + C1) +<nuw> C2) then we will fail to decompose the expression entirely, even though it would be safe and profitable to decompose it to zext(x + C1) +<nuw> zext(C2) Second, we may end up performing unnecessary decompositions, which will later be discarded because they lack nowrap flags necessary for extensions. Third, correctness of the code is not entirely obvious: At a high level, we encounter zext(x -<nuw> C) in the form of a zext on the linear expression x + (-C) with nuw flag set. Notably, this case must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C). The code handles this correctly by speculatively zexting constants to the final bitwidth, and performing additional fixup if the actual extension turns out to be an sext. This was not immediately obvious to me. This patch inverts the approach: An ExtendedValue represents a zext(sext(V)), and linear expression decomposition will try to decompose V further, either by absorbing another sext/zext into the ExtendedValue, or by distributing zext(sext(x op C)) over a binary operator with appropriate nsw/nuw flags. At each step we can determine whether distribution is legal and abort with a partial decomposition if not. We also know which extensions we need to apply to constants, and don't need to speculate or fixup.	2021-03-27 23:31:58 +01:00
Juneyoung Lee	05884d3b52	Make FoldBranchToCommonDest poison-safe by default This is a small patch to make FoldBranchToCommonDest poison-safe by default. After `fc3f0c9c`, only two syntactic changes are needed to fix unit tests. This does not cause any assembly difference in testsuite as well (-O3, X86-64 Manjaro). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99452	2021-03-27 19:05:12 +09:00
Amara Emerson	55533203d7	[GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates. Differential Revision: https://reviews.llvm.org/D99383	2021-03-25 17:23:30 -07:00
Jessica Paquette	23f657c165	[AArch64][GlobalISel] Emit bzero on Darwin Darwin platforms for both AArch64 and X86 can provide optimized `bzero()` routines. In this case, it may be preferable to use `bzero` in place of a memset of 0. This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can be generated by platforms which may want to use bzero. To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The conditions for this are largely a port of the bzero case in `AArch64SelectionDAGInfo::EmitTargetCodeForMemset`. The only difference in comparison to the SelectionDAG code is that, when compiling for minsize, this will fire for all memsets of 0. The original code notes that it's not beneficial to do this for small memsets; however, using bzero here will save a mov from wzr. For minsize, I think that it's preferable to prioritise omitting the mov. This also fixes a bug in the libcall legalization code which would delete instructions which could not be legalized. It also adds a check to make sure that we actually get a libcall name. Code size improvements (Darwin): - CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign) - CTMark -Oz: -0.2% geomean (-0.5% on bullet) Differential Revision: https://reviews.llvm.org/D99358	2021-03-25 17:14:25 -07:00
Fangrui Song	ed956554f9	[Triple][Driver] Add muslx32 environment and use /lib/ld-musl-x32.so.1 for -dynamic-linker Differential Revision: https://reviews.llvm.org/D99308	2021-03-25 16:25:47 -07:00
Guozhi Wei	3240910f00	[DAE] Adjust param/arg attributes when changing parameter to undef In DeadArgumentElimination pass, if a function's argument is never used, corresponding caller's parameter can be changed to undef. If the param/arg has attribute noundef or other related attributes, LLVM LangRef(https://llvm.org/docs/LangRef.html#parameter-attributes) says its behavior is undefined. SimplifyCFG(D97244) takes advantage of this behavior and does bad transformation on valid code. To avoid this undefined behavior when change caller's parameter to undef, this patch removes noundef attribute and other attributes imply noundef on param/arg. Differential Revision: https://reviews.llvm.org/D98899	2021-03-25 14:53:22 -07:00
Philip Reames	4f5e92cc05	Mark gc.relocate and gc.result as readnone (try 2) As noted in the LangRef, these are semantically readnone projections from the result value of the associated statepoint. However, it turned out we had a few latent bugs being covered up by the fact we were only marking them readonly (see PR49607 for context). As of this change, all known issues are resolved. This is a deliberately minimal patch to make it easy to test downstream and revert with minimal change if that turns out to be necessary. Differential Revision: https://reviews.llvm.org/D98729	2021-03-25 14:50:07 -07:00
Andrew Savonichev	bba25a9cd8	[MCA] Support carry-over instructions for in-order processors Instructions that have more uops than the processor's IssueWidth are issued in multiple cycles. The patch fixes PR49712. Differential Revision: https://reviews.llvm.org/D99339	2021-03-26 00:06:19 +03:00
Nikita Popov	93a636d9f6	[IR] Lift attribute handling for assume bundles into CallBase Rather than special-casing assume in BasicAA getModRefBehavior(), do this one level higher, in the attribute handling of CallBase. For assumes with operand bundles, the inaccessiblememonly attribute applies regardless of operand bundles.	2021-03-25 21:15:39 +01:00
Mircea Trofin	20ad206b60	[NFC] Module::getInstructionCount() is const	2021-03-25 12:29:19 -07:00
Stanislav Mekhanoshin	dc928e9c37	[AMDGPU] Refactoring mfma intrinsic definitions. NFC. Differential Revision: https://reviews.llvm.org/D99366	2021-03-25 12:22:52 -07:00
Jamie Schmeiser	7f2ae3d55f	add print-change diff modes that do not use colour Summary: The colour characters currently added to the output of -print-changed=diff and -print-changed=diff-quiet cause difficulties when capturing the output and examining it in an editor. Change the function to not have the colour characters and add 2 new choices (-print-changed=cdiff and -print-changed=cdiff-quiet) to retain the existing functionality of adding the colour characters. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban) Differential Revision: https://reviews.llvm.org/D97398	2021-03-25 10:35:27 -04:00
Abhina Sreeskantharajan	c83cd8feef	[NFC] Reordering parameters in getFile and getFileOrSTDIN In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used. ``` static ErrorOr<std::unique_ptr<MemoryBuffer>> getFile(const Twine &Filename, bool IsText = false, bool RequiresNullTerminator = true, bool IsVolatile = false); static ErrorOr<std::unique_ptr<MemoryBuffer>> getFileOrSTDIN(const Twine &Filename, bool IsText = false, bool RequiresNullTerminator = true); static ErrorOr<std::unique_ptr<MB>> getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset, bool IsText, bool RequiresNullTerminator, bool IsVolatile); static ErrorOr<std::unique_ptr<WritableMemoryBuffer>> getFile(const Twine &Filename, bool IsVolatile = false); ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D99182	2021-03-25 09:47:49 -04:00
Philip Reames	9a82f42d12	Plumb TLI through isSafeToExecuteUnconditionally [NFC] Split from D95815 to reduce patch size. Isn't (yet) used for anything, only the client side is wired up.	2021-03-24 17:52:04 -07:00
Sanjay Patel	adf42dff42	[ValueTracking] peek through min/max to find isKnownToBeAPowerOfTwo This is similar to the select logic just ahead of the new code. Min/max choose exactly one value from the inputs, so if both of those are a power-of-2, then the result must be a power-of-2. This might help with D98152, but we likely still need other pieces of the puzzle to avoid regressions. The change in PatternMatch.h is needed to build with clang. It's possible there is a better way to deal with the 'const' incompatibities. Differential Revision: https://reviews.llvm.org/D99276	2021-03-24 17:54:38 -04:00
Gulfem Savrun Yeniceri	5fbe1fdf17	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `5fd001a5ff` because it broke clang-with-thin-lto-ubuntu bot.	2021-03-24 18:59:33 +00:00
Gulfem Savrun Yeniceri	5fd001a5ff	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-24 17:31:18 +00:00
Thomas Preud'homme	3b52c04e82	Make FindAvailableLoadedValue TBAA aware FindAvailableLoadedValue() relies on FindAvailablePtrLoadStore() to run the alias analysis when searching for an equivalent value. However, FindAvailablePtrLoadStore() calls the alias analysis framework with a memory location for the load constructed from an address and a size, which thus lacks TBAA metadata info. This commit modifies FindAvailablePtrLoadStore() to accept an optional memory location as parameter to allow FindAvailableLoadedValue() to create it based on the load instruction, which would then have TBAA metadata info attached. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99206	2021-03-24 17:20:26 +00:00
Konstantin Zhuravlyov	f4ace63737	AMDGPU: Add target id and code object v4 support - Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object) - Add kernarg_size to kernel descriptor - Change trap handler ABI to no longer move queue pointer into s[0:1] - Cleanup ELF definitions - Add V2, V3, V4 suffixes to make a clear distinction for code object version - Consolidate note names Differential Revision: https://reviews.llvm.org/D95638	2021-03-24 11:54:05 -04:00
Sander de Smalen	55d18b3cc2	[TTI] Return a TypeSize from getRegisterBitWidth. This patch changes the interface to take a RegisterKind, to indicate whether the register bitwidth of a scalar register, fixed-width vector register, or scalable vector register must be returned. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D98874	2021-03-24 14:45:13 +00:00
Anirudh Prasad	301d9261b7	[AsmParser][SystemZ][z/OS] Re-introduce HLASM comment syntax - https://reviews.llvm.org/rGb605cfb336989705f391d255b7628062d3dfe9c3 was reverted due to sanitizer bugs in the introduced unit-test (specifically in the Address sanitizer https://lab.llvm.org/buildbot/#/builders/5/builds/5697) - This patch attempts to rectify that, as well as re-factor parts of the test - The issue was previously, within the `setupCallToAsmParser` function in the unit-test, `SrcMgr` was declared as a local variable. `SrcMgr` owns a unique pointer. Since the variable goes out of scope at the end of the function, the unique pointer is released. - This patch, moves the declaration of the `SrcMgr` variable to a class field, since the scope will remain until the class's destructor is invoked (which in this case is at the end of the unit test) - Furthermore, this patch also moves the `MCContext Ctx` declaration from a local variable instance inside a function, to a unique pointer class field. This ensures the instantiation of the MCContext remains until the tear down of the test. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D99004	2021-03-24 10:17:00 -04:00
Joseph Huber	8140d0ec4a	[OpenMP] Change OMPIRBuilder to append function attributes Summary: Currently the OMPIRBuilder overwrites the function's existing attributes when it assigns the ones defined in OMPKinds.def. This changes the behaviour to append the current function's attributes with them instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D98740	2021-03-24 09:08:29 -04:00
Andrea Di Biagio	97a00b7b20	[MCA] Fix for uninitialised member in constructor. NFC	2021-03-24 11:21:59 +00:00
Florian Hahn	cd0c00c9fe	[LV] Move exact FP math check out of Requirements. We know if the loop contains FP instructions preventing vectorization after we are done with legality checks. This patch updates the code the check for un-vectorizable FP operations earlier, to avoid unnecessarily running the cost model and picking a vectorization factor. It also makes the code more direct and moves the check to a position where similar checks are done. I might be missing something, but I don't see any reason to handle this check differently to other, similar checks. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98633	2021-03-24 11:01:44 +00:00
Andrew Savonichev	292da93d59	[MCA] Disable RCU for InOrderIssueStage This is a follow-up for: D98604 [MCA] Ensure that writes occur in-order When instructions are aligned by the order of writes, they retire in-order naturally. There is no need for an RCU, so it is disabled. Differential Revision: https://reviews.llvm.org/D98628	2021-03-24 13:54:04 +03:00
Andy Wingo	c9801db2eb	[WebAssembly][MC] Record limit constraints for table sizes This commit adds a full WasmTableType to MCSymbolWasm, differing from the current situation (just an ElemType) in that it additionally records a WasmLimits. We add support for specifying the limits in .S files also, via the following syntax variations: .tabletype SYM, ELEMTYPE .tabletype SYM, ELEMTYPE, MINSIZE .tabletype SYM, ELEMTYPE, MINSIZE, MAXSIZE Depends on D99186. Differential Revision: https://reviews.llvm.org/D99191	2021-03-24 09:44:22 +01:00
Andy Wingo	9ac5620cb8	[WebAssembly] Rename WasmLimits::Initial to ::Minimum. NFC. This patch renames the "Initial" member of WasmLimits to the name used in the spec, "Minimum". In the core WebAssembly specification, the Limits data type has one required "min" member and one optional "max" member, indicating the minimum required size of the corresponding table or memory, and the maximum size, if any. Although the WebAssembly spec does instantiate locally-defined tables and memories with the initial size being equal to the minimum size, it can't impose such a requirement for imports. It doesn't make sense to require an initial size for a memory import, for example. The compiler can only sensibly express the minimum and maximum sizes. See https://github.com/WebAssembly/js-types/blob/master/proposals/js-types/Overview.md#naming-of-size-limits for a related discussion that agrees that the right name of "initial" is "minimum" when querying the type of a table or memory from JavaScript. (Of course it still makes sense for JS to speak in terms of an initial size when it explicitly instantiates memories and tables.) Differential Revision: https://reviews.llvm.org/D99186	2021-03-24 09:10:11 +01:00
Alex Orlov	876435c487	* Fix demangling of optional template-args for vendor extended type qualifier. This fixes https://bugs.llvm.org/show_bug.cgi?id=48009 bug. Reviewed By: erik.pilkington, krisb Differential Revision: https://reviews.llvm.org/D98687	2021-03-24 10:21:32 +04:00
Chuanqi Xu	3b83590cb2	[NFC] [Support] Fix unconsistent comment with codes for ExtendSigned	2021-03-24 13:58:54 +08:00
Max Kazantsev	85cbfe75af	[NFC] Fix comment describing what EdgeBundles is The original comment says the same thing twice, and does not mention that edges entering the block are also in the same bundle (which seems true from what the underlying code is doing). Differential Revision: https://reviews.llvm.org/D99144 Reviewed By: RKSimon	2021-03-24 11:04:05 +07:00
Serguei Katkov	311d81ce97	[RegAlloc] Fix "ran out of regs" with uses in statepoint Statepoint instruction is known to have a variable and big number of operands. It is possible that Register Allocator will split live intervals in the way that all physical registers are occupied by "zero-length" live intervals which are marked as not-spillable. While intervals are marked as not-spillable in the moment of creation when they are really zero-length it is possible that in future as part of re-materialization there will need for physical register between def and use of such tiny interval (the use is not related to this interval at all). As all physical registers are assigned to not-spillable intervals there is not avaialbe registers and RA reports an error. The idea of the fix is avoid marking tiny live intervals where there is a use in statepoint instruction in var args section. Such interval may be perfectly spilled and folded to operand of statepoint. Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98766	2021-03-24 10:25:34 +07:00
Choongwoo Han	772e1dd1dd	[Coverage] Load records immediately The current implementation keeps buffers generated for each object file until it completes loading of all files. This approach requires a lot of memory if there are a lot of huge object files. Thus, make it to load coverage records immediately rather than waiting for other binaries to be loaded. This reduces memory usage of llvm-cov from >128GB to 5GB when loading Chromium binaries in Windows. Additional testing: check-profile, check-llvm Differential Revision: https://reviews.llvm.org/D99110	2021-03-23 16:25:20 -07:00
Rafael Auler	53196387c2	Add register size info back to MCRegisterClass This patch addresses the removal of register size information done in commit `c8b782c`. Without this change, there is no viable option to get register size information outside libTarget. We need this information to run analysis that know the register size from the MC layer, used by BOLT. Discussion D50285 and D47199. Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D97891	2021-03-23 15:04:44 -07:00
Alexey Bataev	99203f2004	[Analysis]Add getPointersDiff function to improve compile time. Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967	2021-03-23 14:25:36 -07:00
Alexey Bataev	f1b47ad278	Revert "[Analysis]Add getPointersDiff function to improve compile time." This reverts commit `065a14a12d` to investigate and fix crash in SLP vectorizer.	2021-03-23 13:17:54 -07:00
Alexey Bataev	065a14a12d	[Analysis]Add getPointersDiff function to improve compile time. Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967	2021-03-23 12:58:42 -07:00
Tony	c181724a9b	[NFC][AMDGPU] Reserve AMD GPU ELF machine number 0x41 Reviewed By: foad Differential Revision: https://reviews.llvm.org/D99196	2021-03-23 17:53:02 +00:00
Nathan James	a0f48d57a9	[NFC] Enable RVALUE_REFERENCE_THIS on MSVC 2019 In https://reviews.llvm.org/D72948 This was enabled for all MSVC but reverted as it was determined not to work on some 2017 versions. The issue is assumed to be fixed on 2019 so enable for 2019 and newer. Some testing could be done to determine which version of MSVC 2017 support this feature but its safer right now to leave it at 2019. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D98809	2021-03-23 16:40:13 +00:00
Andrea Di Biagio	f5bdc88e4d	[MCA] Improved handling of negative read-advance cycles. Before this patch, register writes were always invalidated by the RegisterFile at instruction commit stage. So, the RegisterFile was often losing the knowledge about the `execute cycle` of writes already committed. While this was not problematic for non-delayed reads, this was sometimes leading to inaccurate read latency computations in the presence of negative read-advance cycles. This patch fixes the issue by changing how the RegisterFile component internally keeps track of the `execute cycle` information of each write. On every instruction executed, the RegisterFile gets notified by the RetireStage, so that it can internally record the execute cycle of each executed write. The `execute cycle` information is stored within WriteRef itself, and it is not invalidated when the write is committed.	2021-03-23 14:47:23 +00:00
Jamie Schmeiser	64336d3421	Revert "A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash." This reverts commit `9544a32287`.	2021-03-23 10:09:27 -04:00
Jamie Schmeiser	9544a32287	A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash. Summary: The IR is saved in its print form before each pass is started and a signal handler is registered. If the compilation crashes, the signal handler will print the saved IR to dbgs(). This option can be modified using -print-module-scope to get the IR for the complete module. Note that this option only works with the new pass manager. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban) Differential Revision: https://reviews.llvm.org/D86657	2021-03-23 09:29:17 -04:00
serge-sans-paille	e19884cd74	Introduce a generic operator to apply complex operations to BitVector This avoids temporary and memcpy call when computing large expressions. It's basically some kind of poor man's expression template, but it seems easier to maintain to have a single generic `apply` call instead of the whole expression template machinery here. Differential Revision: https://reviews.llvm.org/D98176	2021-03-23 14:23:26 +01:00
Yvan Roux	241032a205	[llvm-symbolizer][llvm-nm] Fix AArch64 and ARM mapping symbols handling. Exclude AArch64 mapping symbols ($x and $d) for symtab symbolization as it was done for ARM since D95916 tom bring bots back to green state. This is implemented by setting SF_FormatSpecific such that llvm-symbolizer will ignore them, and use this flag to re-implement llvm-nm --special-syms option which make it work for both targets. Differential Revision: https://reviews.llvm.org/D98803	2021-03-23 14:17:12 +01:00
Valentin Clement	d709dcc090	[openacc][openmp] Reduce number of generated file and prefer inclusion of .inc Follow up from D92955 and D83636. This patch makes the base cpp files OMP.cpp and ACC.cpp normal files and they now include the XXX.inc file generated by tablegen. This reduces the number of file generated by the DirectiveEmitter backend and makes it closer to the proposal in D83636. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D93560	2021-03-23 09:16:53 -04:00
Matt Arsenault	b24436ac96	GlobalISel: Lower funnel shifts	2021-03-23 09:11:17 -04:00
David Sherwood	748ae5281d	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Pushpinder Singh	d0e5422eb8	[GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO Reviewed By: foad Differential Revision: https://reviews.llvm.org/D93963	2021-03-23 05:45:43 +00:00
Rahman Lavaee	949abf7d6a	[llvm-readelf, propeller] Add fallthrough bit to basic block metadata in BB-Address-Map section. This patch adds a fallthrough bit to basic block metadata, indicating whether the basic block can fallthrough without taking any branches. The bit will help us avoid an intel LBR bug which results in occasional duplicate entries at the beginning of the LBR stack. This patch uses `MachineBasicBlock::canFallThrough()` to set the bit. This is not a const method because it eventually calls `TargetInstrInfo::analyzeBranch`, but it calls this function with the default `AllowModify=false`. So we can either make the argument to the `getBBAddrMapMetadata` non-const, or we can use `const_cast` when calling `canFallThrough`. I decide to go with the latter since this is purely due to legacy code, and in general we should not allow the BasicBlock to be mutable during `getBBAddrMapMetadata`. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D96918	2021-03-22 21:38:05 -07:00
Tony	1e04706adb	[AMDGPU] Reserve ELF code Reserve AMD GPU ELF machine code 0x040. Minor AMDGPUUsage format consistency change. Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D99122	2021-03-23 04:30:38 +00:00
Gulfem Savrun Yeniceri	e3a6d70c68	Revert "[Passes] Add relative lookup table converter pass" This reverts commit `78a65cd945` which caused buildbot failures.	2021-03-23 00:43:16 +00:00
Juneyoung Lee	5c2e50b5d2	Reland "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe" This relands commit `99108c791d` (D95026) which was reverted by `8d5a981a13` because the underlying problem (https://llvm.org/pr49495) is fixed.	2021-03-23 09:19:53 +09:00
Gulfem Savrun Yeniceri	78a65cd945	[Passes] Add relative lookup table converter pass Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in: https://bugs.llvm.org/show_bug.cgi?id=45244 This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly. Differential Revision: https://reviews.llvm.org/D94355	2021-03-22 22:09:02 +00:00
Roman Lebedev	d37fe26a2b	[NFC][IR] Type: add getWithNewType() method Sometimes you want to get a type with same vector element count as the current type, but different element type, but there's no QOL wrapper to do that. Add one.	2021-03-23 00:50:58 +03:00
Nikita Popov	7e18cd887c	[InstCombine] Whitelist non-refining folds in SimplifyWithOpReplaced This is an alternative to D98391/D98585, playing things more conservatively. If AllowRefinement == false, then we don't use InstSimplify methods at all, and instead explicitly implement a small number of non-refining folds. Most cases are handled by constant folding, and I only had to add three folds to cover our unit tests / test-suite. While this may lose some optimization power, I think it is safer to approach from this direction, given how many issues this code has already caused. Differential Revision: https://reviews.llvm.org/D99027	2021-03-22 22:12:56 +01:00
Nikita Popov	ca28e32359	[IR] Mark assume/annotation as InaccessibleMemOnly These intrinsics don't need to be marked as arbitrary writing, it's sufficient to write inaccessible memory (aka "side effect") to preserve control dependencies. This means less special-casing in BasicAA. This is intended as an alternative to D98925. Differential Revision: https://reviews.llvm.org/D99022	2021-03-22 22:01:03 +01:00
Sanjay Patel	664d0c052c	[TargetTransformInfo] move branch probability query from TargetLoweringInfo This is no-functional-change intended (NFC), but needed to allow optimizer passes to use the API. See D98898 for a proposed usage by SimplifyCFG. I'm simplifying the code by removing the cl::opt. That was added back with the original commit in D19488, but I don't see any evidence in regression tests that it was used. Target-specific overrides can use the usual patterns to adjust as necessary. We could also restore that cl::opt, but it was not clear to me exactly how to do it in the convoluted TTI class structure.	2021-03-22 15:55:34 -04:00
Matt Arsenault	9fdfd8dd52	GlobalISel: Add utility function to constant fold FP ops	2021-03-22 14:38:17 -04:00
Lang Hames	cc4ad2c540	[JITLink][ELF/x86-64] Add support for GOTOFF64 relocation.	2021-03-22 10:40:50 -07:00
Stefan Gränitz	50e499a56d	[Orc] Fix copy elision warning in RPCUtils The `callB()` template function always moved errors on return, because in the majority of cases its return type is an `Expected<T>` and the error must be moved into the implicit ctor. For the special case of a `void` result, however, the `ResultTraits` class is specialized and the return type is a raw `Error`. Some build bots complain, that in favor of NRVO errors should not be moved in this case. ``` llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1513:27: llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1519:27: llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1526:29: warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move] ``` The warning is reasonable from a type-system point of view. For performance it's entirely insignificant. Differential Revision: https://reviews.llvm.org/D98947	2021-03-22 17:47:33 +01:00
Stefan Gränitz	c154cddabd	[Orc] Fix tracking of pending debug objects in DebugObjectManagerPlugin There can be multiple MaterializationResponsibilitys in-flight for a single ResourceKey. Hence, pending debug objects must be tracked by MaterializationResponsibility and not by ResourceKey. Differential Revision: https://reviews.llvm.org/D98785	2021-03-22 17:47:32 +01:00
Philip Reames	d4648eeaa2	[SCEV] Use trip count information to improve shift recurrence ranges This patch exploits the knowledge that we may be running many fewer than bitwidth iterations of the loop, and may be able to disallow the overflow case. This patch specifically implements only the shl case, but this can be generalized to ashr and lshr without difficulty. Differential Revision: https://reviews.llvm.org/D98222	2021-03-22 09:38:43 -07:00
Philip Reames	9c16621c0d	Clarify comments on recurrence matcher [NFC] Triggered by discussion on D98222. The case where we have a loop variant step is suprising, and doesn't match the behavior of SCEV's recurrences. As such, make sure we call that out explicitly.	2021-03-22 09:23:06 -07:00
Wenlei He	ce6bfe9411	[CSSPGO][llvm-profgen] Use profile summary based threshold for context trimming and merging Switch to use cold threshold from profile summary for cold context merging and trimming, instead of relying on hard coded values. Minor refactoring included for switch names, etc. Differential Revision: https://reviews.llvm.org/D98921	2021-03-22 08:56:59 -07:00
Alexey Lapshin	972b6a3a34	[llvm-objcopy][Support] move writeToOutput helper function to Support. writeToOutput function is useful when it is necessary to create different kinds of streams(based on stream name) and when we need to use a temporary file while writing(which would be renamed into the resulting file in a success case). This patch moves the writeToStream helper into the Support library. Differential Revision: https://reviews.llvm.org/D98426	2021-03-22 15:41:10 +03:00
Bradley Smith	48f5a392cb	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Lang Hames	fc36a511c6	[JITLink][ELF/x86-64] Add support for R_X86_64_GOTPC64 and R_X86_64_GOT64. Start adding support for ELF x86-64 large code model, PIC relocations.	2021-03-21 21:52:54 -07:00
Lang Hames	0a74ec3299	[JITLink] Start laying the groundwork for ELF x86-64 large code model support. Introduces DefineExternalSectionStartAndEndSymbols.h, which defines a template for a JITLink pass that transforms external symbols meeting a user-supplied predicate into defined symbols pointing at the start and end of a Section identified by the predicate. JITLink.h is updated with a new makeAbsolute function to support this pass. Also renames BasicGOTAndStubsBuilder to PerGraphGOTAndPLTStubsBuilder -- the new name better describes the intent of this GOT and PLT stubs builder, and will help to distinguish it from future GOT and PLT stub builders that build entries that may be shared between multiple graphs.	2021-03-21 20:56:47 -07:00
Lang Hames	209ceed745	[JITLink][ELF/x86-64] Add Delta32, NegDelta32, NegDelta64 support. These were missing, but are used in eh-frame section support.	2021-03-21 20:15:40 -07:00
Roman Lebedev	e3a4701627	[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights `08196e0b2e` exposed LowerExpectIntrinsic's internal implementation detail in the form of LikelyBranchWeight/UnlikelyBranchWeight options to the outside. While this isn't incorrect from the results viewpoint, this is suboptimal from the layering viewpoint, and causes confusion - should transforms also use those weights, or should they use something else, D98898? So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight internal again, and fixing all the code that used it directly, which currently is only clang codegen, thankfully, to emit proper @llvm.expect intrinsics instead.	2021-03-21 22:50:21 +03:00
Roman Lebedev	37d6be9052	Revert "[BranchProbability] move options for 'likely' and 'unlikely'" Upon reviewing D98898 i've come to realization that these are implementation detail of LowerExpectIntrinsicPass, and they should not be exposed to outside of it. This reverts commit `ee8b53815d`.	2021-03-21 22:50:21 +03:00
Matt Arsenault	20a24af01d	MIR: Fix missing serialization for HasTailCall	2021-03-21 13:14:04 -04:00
Sanjay Patel	ee8b53815d	[BranchProbability] move options for 'likely' and 'unlikely' This makes the settings available for use in other passes by housing them within the Support lib, but NFC otherwise. See D98898 for the proposed usage in SimplifyCFG (where this change was originally included). Differential Revision: https://reviews.llvm.org/D98945	2021-03-20 14:46:46 -04:00
Jeroen Dobbelaere	77080a1eb6	Revert of D49126 [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types. Now that intrinsic name mangling can cope with unnamed types, the custom name mangling in PredicateInfo (introduced by D49126) can be removed. (See D91250, D48541) Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D91661	2021-03-20 11:37:09 +01:00
Shao-Ce Sun	4d11baab25	[NFC][ValueTypes] Align code by column Adjusted some whitespaces. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98975	2021-03-20 13:43:07 +08:00
Lang Hames	f380066461	[JITLink] Remove redundant local variable definitions from a unit test.	2021-03-19 18:29:36 -07:00
Ellis Hoag	d90270e9e8	Port D97640 to llvm/include/llvm/ProfileData/InstrProfData.inc Differential Revision: https://reviews.llvm.org/D98982	2021-03-19 16:24:16 -07:00
Christoffer Lernö	528f6f7d61	Add type attributes to LLVM C API The LLVM C API is missing type attributes as is needed by attributes such as sret and byval. This patch adds three missing wrapper functions. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=48249 https://reviews.llvm.org/D97763	2021-03-19 19:07:04 -04:00
Jessica Paquette	4773dd5ba9	[GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes) There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG. E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain code that matches a bitfield extract from an and + right shift. Rather than duplicating code in the same way, this adds two opcodes: - G_UBFX (unsigned bitfield extract) - G_SBFX (signed bitfield extract) They work like this ``` %x = G_UBFX %y, %lsb, %width ``` Where `lsb` and `width` are - The least-significant bit of the extraction - The width of the extraction This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends the result, while G_SBFX sign-extends the result. This should allow us to use the combiner to match the bitfield extraction patterns rather than duplicating pattern-matching code in each target. Differential Revision: https://reviews.llvm.org/D98464	2021-03-19 14:37:19 -07:00
Fangrui Song	948be862d6	[llvm-readobj] Remove legacy GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} and dump new GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} https://sourceware.org/bugzilla/show_bug.cgi?id=26703 deprecated the previous GNU_PROPERTY_X86_ISA_1_{CMOV,SSE,*} values (renamed to `COMPAT`) and added new values. Since the legacy values are not used by compilers, having dumping support in llvm-readobj is unnecessary. So just drop the legacy feature. The new values are used by GCC 11 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250) `-march=x86-64-v[234]` to indicate the micro-architecture ISA levels. Differential Revision: https://reviews.llvm.org/D98818	2021-03-19 14:35:22 -07:00
Philip Reames	5698537f81	Update basic deref API to account for possiblity of free [NFC] This patch is plumbing to support work towards the goal outlined in the recent llvm-dev post "[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree". The point of this change is purely to simplify iteration on other pieces on way to making the switch. Rebuilding with a change to Value.h is slow and painful, so I want to get the API change landed. Once that's done, I plan to more closely audit each caller, add the inference rules in their own patch, then post a patch with the langref changes and test diffs. The value of the command line flag is that we can exercise the inference logic in standalone patches without needing the whole switch ready to go just yet. Differential Revision: https://reviews.llvm.org/D98908	2021-03-19 11:17:19 -07:00
Alexey Bataev	14ae0cf0f5	[Cost]Canonicalize the cost for logical or/and reductions. The generic cost of logical or/and reductions should be cost of bitcast <ReduxWidth x i1> to iReduxWidth + cmp eq\|ne iReduxWidth. Differential Revision: https://reviews.llvm.org/D97961	2021-03-19 11:01:58 -07:00
Paul C. Anagnostopoulos	a9fc44c557	[TableGen] Improve handling of template arguments This requires changes to TableGen files and some C++ files due to incompatible multiclass template arguments that slipped through before the improved handling.	2021-03-19 09:57:53 -04:00
Jeroen Dobbelaere	04790d9cfb	Support intrinsic overloading on unnamed types This patch adds support for intrinsic overloading on unnamed types. This fixes PR38117 and PR48340 and will also be needed for the Full Restrict Patches (D68484). The main problem is that the intrinsic overloading name mangling is using 's_s' for unnamed types. This can result in identical intrinsic mangled names for different function prototypes. This patch changes this by adding a '.XXXXX' to the intrinsic mangled name when at least one of the types is based on an unnamed type, ensuring that we get a unique name. Implementation details: - The mapping is created on demand and kept in Module. - It also checks for existing clashes and recycles potentially existing prototypes and declarations. - Because of extra data in Module, Intrinsic::getName needs an extra Module* argument and, for speed, an optional FunctionType* argument. - I still kept the original two-argument 'Intrinsic::getName' around which keeps the original behavior (providing the base name). -- Main reason is that I did not want to change the LLVMIntrinsicGetName version, as I don't know how acceptable such a change is -- The current situation already has a limitation. So that should not get worse with this patch. - Intrinsic::getDeclaration and the verifier are now using the new version. Other notes: - As far as I see, this should not suffer from stability issues. The count is only added for prototypes depending on at least one anonymous struct - The initial count starts from 0 for each intrinsic mangled name. - In case of name clashes, existing prototypes are remembered and reused when that makes sense. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D91250	2021-03-19 14:34:25 +01:00
Abhina Sreeskantharajan	4f750f6ebc	[SystemZ][z/OS] Distinguish between text and binary files on z/OS This patch consists of the initial changes to help distinguish between text and binary content correctly on z/OS. I would like to get feedback from Windows users on setting OF_None for all ToolOutputFiles. This seems to have been done as an optimization to prevent CRLF translation on Windows in the past. Reviewed By: zibi Differential Revision: https://reviews.llvm.org/D97785	2021-03-19 08:09:57 -04:00
Simon Pilgrim	a96897219d	[KnownBits] Add knownbits analysis for mulhs/mulu 'multiply high' instructions Split off from D98857 https://reviews.llvm.org/D98866	2021-03-19 08:56:06 +00:00
Wenlei He	1410db70b9	[CSSPGO] Add attribute metadata for context profile This changes adds attribute field for metadata of context profile. Currently we have an inline attribute that indicates whether the leaf frame corresponding to a context profile was inlined in previous build. This will be used to help estimating inlining and be taken into account when trimming context. Changes for that in llvm-profgen will follow. It will also help tuning. Differential Revision: https://reviews.llvm.org/D98823	2021-03-18 22:00:56 -07:00
Philip Reames	fa26da0582	Add a couple of missing attribute query methods [NFC]	2021-03-18 17:33:20 -07:00
Yuanfang Chen	b4a8c0ebb6	[LTO][MC] Discard non-prevailing defined symbols in module-level assembly This is the alternative approach to D96931. In LTO, for each module with inlineasm block, prepend directive ".lto_discard <sym>, <sym>*" to the beginning of the inline asm. ".lto_discard" is both a module inlineasm block marker and (optionally) provides a list of symbols to be discarded. In MC while emitting for inlineasm, discard symbol binding & symbol definitions according to ".lto_disard". Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98762	2021-03-18 15:33:42 -07:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Thomas Lively	2f2ae08da9	[WebAssembly] Remove experimental SIMD instructions Removes the instruction definitions, intrinsics, and builtins for qfma/qfms, signselect, and prefetch instructions, which were not included in the final WebAssembly SIMD spec. Depends on D98457. Differential Revision: https://reviews.llvm.org/D98466	2021-03-18 11:21:24 -07:00
Mike Rice	c2f8e158f5	[OPENMP51]Support for the 'destroy' clause with interop variable. Added basic parsing/sema/serialization support to extend the existing 'destroy' clause for use with the 'interop' directive. Differential Revision: https://reviews.llvm.org/D98834	2021-03-18 09:12:56 -07:00
Andrew Savonichev	e6ce0db378	[MCA] Ensure that writes occur in-order Delay the issue of a new instruction if that leads to out-of-order commits of writes. This patch fixes the problem described in: https://bugs.llvm.org/show_bug.cgi?id=41796#c3 Differential Revision: https://reviews.llvm.org/D98604	2021-03-18 17:10:20 +03:00
Matt Arsenault	b9a0384983	GlobalISel: Preserve source value information for outgoing byval args Pass through the original argument IR value in order to preserve the aliasing information in the memcpy memory operands.	2021-03-18 09:16:54 -04:00
Matt Arsenault	61f834cc09	GlobalISel: Insert memcpy for outgoing byval arguments byval requires an implicit copy between the caller and callee such that the callee may write into the stack area without it modifying the value in the parent. Previously, this was passing through the raw pointer value which would break if the callee wrote into it. Most of the time, this copy can be optimized out (however we don't have the optimization SelectionDAG does yet). This will trigger more fallbacks for AMDGPU now, since we don't have legalization for memcpy yet (although we should stop using byval anyway).	2021-03-18 09:16:54 -04:00
Max Kazantsev	b3a1500ea8	[SCEV][NFC] API for predicate evaluation Provides API that allows to check predicate for being true or false with one call. Current implementation is naive and just calls isKnownPredicate twice, but further we can rework this logic trying to use one check to prove both facts.	2021-03-18 19:21:29 +07:00
Sanjay Patel	c8893f3b78	[LoopVectorize] relax FMF constraint for FP induction This makes the induction part of the loop vectorizer match the reduction part. We do not need all of the fast-math-flags. For example, there are some that clearly are not in play like arcp or afn. If we want to make FMF constraints consistent across the IR optimizer, we might want to add nsz too, but that's up for debate (users can't expect associative FP math and preservation of sign-of-zero at the same time?). The calling code was fixed to avoid miscompiles with: `1bee549737` Differential Revision: https://reviews.llvm.org/D98708	2021-03-18 08:11:22 -04:00
Lang Hames	0604e0bc68	[JITLink] Reformat an enum.	2021-03-17 21:43:53 -07:00
Lang Hames	86ec3fd9d9	[JITLink] Improve out-of-range error messages. Switches all backends to use the makeTargetOutOfRangeError function from JITLink.h.	2021-03-17 21:35:24 -07:00
Chen Zheng	d33b016ada	[XCOFF][llvm-dwarfdump] llvm-dwarfdump support for XCOFF Author: hubert.reinterpretcast, shchenz Reviewed By: jasonliu, echristo Differential Revision: https://reviews.llvm.org/D97186	2021-03-17 21:21:51 -04:00
Joel E. Denny	dd59c1324d	[FileCheck] Fix numeric error propagation A more general name might be match-time error propagation. That is, it's conceivable we'll one day have non-numeric errors that require the handling fixed by this patch. Without this patch, FileCheck behaves as follows: ``` $ cat check CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]] $ FileCheck -vv -dump-input=never check < input check:1:54: remark: implicit EOF: expected string found in input CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]] ^ <stdin>:2:1: note: found here ^ check:1:15: error: unable to substitute variable or numeric expression: overflow error CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]] ^ $ echo $? 0 ``` Notice that the exit status is 0 even though there's an error. Moreover, FileCheck doesn't print the error diagnostic unless both `-dump-input=never` and `-vv` are specified. The same problem occurs when `CHECK-NOT` does have a match but a capture fails due to overflow: exit status is 0, and no diagnostic is printed unless both `-dump-input=never` and `-vv` are specified. The usefulness of capturing from `CHECK-NOT` is questionable, but this case should certainly produce an error. With this patch, FileCheck always includes the error diagnostic and has non-zero exit status for the above examples. It's conceivable that this change will cause some existing tests to fail, but my assumption is that they should fail. Moreover, with nearly every project enabled, this patch didn't produce additional `check-all` failures for me. This patch also extends input dumps to include such numeric error diagnostics for both expected and excluded patterns. As noted in fixmes in some of the tests added by this patch, this patch worsens an existing issue with redundant diagnostics. I'll fix that bug in a subsequent patch. Reviewed By: thopre, jhenderson Differential Revision: https://reviews.llvm.org/D98086	2021-03-17 19:25:41 -04:00
Mike Rice	c615927c8e	[OPENMP51]Initial support for the use clause. Added basic parsing/sema/serialization support for the 'use' clause. Differential Revision: https://reviews.llvm.org/D98815	2021-03-17 15:46:14 -07:00
Amara Emerson	d7fed7b899	[AArch64][GlobalISel] Fall back if disabling neon/fp in the translator. The previous technique relied on early-exiting the legalizer predicate initialization, leaving an empty rule table. That causes a fallback for most instructions, but some have legacy rules defined like G_ZEXT which can try continue, but then crash. We should fall back earlier, in the translator, to avoid this issue. Differential Revision: https://reviews.llvm.org/D98730	2021-03-17 15:08:08 -07:00
Philip Reames	31764ea295	[LCSSA] Extract a utility for deciding if a new use requires a new lcssa phi [NFC] (Triggered by a review comment on D98728, but otherwise unrelated.)	2021-03-17 12:14:01 -07:00
David Green	e2935dcfc4	[TTI] Add a Mask to getShuffleCost This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask can be provided a more accurate cost can be provided by the backend. For example VREV costs could be returned by the ARM backend. This should be an NFC until then, laying the groundwork for that to be added. Differential Revision: https://reviews.llvm.org/D98206	2021-03-17 17:46:26 +00:00
Stephen Tozer	3bfddc2593	Reapply "[DebugInfo] Handle multiple variable location operands in IR" Fixed section of code that iterated through a SmallDenseMap and added instructions in each iteration, causing non-deterministic code; replaced SmallDenseMap with MapVector to prevent non-determinism. This reverts commit `01ac6d1587`.	2021-03-17 16:45:25 +00:00
Mike Rice	410f09af09	[OPENMP51]Initial support for the interop directive. Added basic parsing/sema/serialization support for interop directive. Support for the 'init' clause. Differential Revision: https://reviews.llvm.org/D98558	2021-03-17 09:42:07 -07:00
Simon Pilgrim	cfc256ba9f	[DAG] TargetLowering::isBinOp() - add ISD::SSUBSAT/USUBSAT Add to the generic non-commutative binop list.	2021-03-17 14:51:00 +00:00
Alexey Lapshin	021de7cf80	[llvm-objcopy][NFC] Move ownership keeping code into restoreStatOnFile(). The D93881 added functionality which preserve ownership for output file if llvm-objcopy is called under root. That code was added into the place where output file is created. The llvm-objcopy already has a function which sets/restores rights/permissions for the output file. That is the restoreStatOnFile() function. This patch moves code (preserving ownershipping) into the restoreStatOnFile() function. Differential Revision: https://reviews.llvm.org/D98511	2021-03-17 17:27:00 +03:00
Hans Wennborg	01ac6d1587	Revert "[DebugInfo] Handle multiple variable location operands in IR" This caused non-deterministic compiler output; see comment on the code review. > This patch updates the various IR passes to correctly handle dbg.values with a > DIArgList location. This patch does not actually allow DIArgLists to be produced > by salvageDebugInfo, and it does not affect any pass after codegen-prepare. > Other than that, it should cover every IR pass. > > Most of the changes simply extend code that operated on a single debug value to > operate on the list of debug values in the style of any_of, all_of, for_each, > etc. Instances of setOperand(0, ...) have been replaced with with > replaceVariableLocationOp, which takes the value that is being replaced as an > additional argument. In places where this value isn't readily available, we have > to track the old value through to the point where it gets replaced. > > Differential Revision: https://reviews.llvm.org/D88232 This reverts commit `df69c69427`.	2021-03-17 13:36:48 +01:00
Bradley Smith	cf0da91ba5	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Fangrui Song	5bd6b0a62b	[MC] Delete unused MCOperand::{create,is,get}FPImm	2021-03-17 00:30:38 -07:00
Max Kazantsev	a6074b092c	[BasicAA] Drop dependency on Loop Info. PR43276 BasicAA stores a reference to LoopInfo inside. This imposes an implicit requirement of keeping it up to date whenever we modify the IR (in particular, whenever we modify terminators of blocks that belong to loops). Failing to do so leads to incorrect state of the LoopInfo. Because general AA does not require loop info updates and provides to API to update it properly, the users of AA reasonably assume that there is no need to update the loop info. It may be a reason of bugs, as example in PR43276 shows. This patch drops dependence of BasicAA on LoopInfo to avoid this problem. This may potentially pessimize the result of queries to BasicAA. Differential Revision: https://reviews.llvm.org/D98627 Reviewed By: nikic	2021-03-17 11:43:44 +07:00
Anirudh Prasad	9f5da80013	Revert "[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax"" This reverts commit `b605cfb336`. Differential Revision: https://reviews.llvm.org/D98744	2021-03-16 18:39:04 -04:00
Fangrui Song	5d037458a3	[RISCV] Make empty name symbols SF_FormatSpecific so that llvm-symbolizer ignores them for symbolization On RISC-V, clang emits empty name symbols used for label differences. (In GCC the symbols are typically `.L0`) After D95916, the empty name symbols can show up in llvm-symbolizer's symbolization output. They have no names and thus not useful. Set `SF_FormatSpecific` so that llvm-symbolizer will ignore them. `SF_FormatSpecific` is also used in LTO but that case should not matter. Corresponding addr2line problem: https://sourceware.org/bugzilla/show_bug.cgi?id=27585 Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98669	2021-03-16 14:12:18 -07:00
Anirudh Prasad	b605cfb336	[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax" - Previously, https://reviews.llvm.org/D97703 was [[ https://reviews.llvm.org/D98543 \| reverted ]] as it broke when building the unit tests when shared libs on. - This patch reverts the "revert" and makes two minor changes - The first is it also links in the MCParser lib when building the unittest. This should resolve the issue when building with with shared libs on and off - The second renames the name of the unit test from `SystemZAsmLexer` to `SystemZAsmLexerTests` since the convention for unittest binaries is to suffix the name of the unit test with "Tests" Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D98666	2021-03-16 17:11:46 -04:00
Nikita Popov	40bc309911	Revert "[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration" This reverts commit `d40b4911bd`. This causes a large compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=0aa637b2037d882ddf7861284169abf63f524677&to=d40b4911bd9aca0573752e065f29ddd9aff280e1&stat=instructions	2021-03-16 20:41:26 +01:00
Mircea Trofin	d40b4911bd	[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs() needs to be called before iterating the interfering live ranges. The rest of the patch offers support that is the case: instead of clearing the query's InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix is invalidated (existing triggering mechanism). Without the change in RegAllocGreedy.cpp, the compiler ices. This patch should make it more easily discoverable by developers that collectInterferringVregs needs to be called before iterating. I will follow up with a subsequent patch to improve the usability and maintainability of Query. Differential Revision: https://reviews.llvm.org/D98232	2021-03-16 12:10:10 -07:00
Nick Lewycky	b743bbc505	Add ConstantDataVector::getRaw() to create a constant data vector from raw data. This parallels ConstantDataArray::getRaw() and can be used with ConstantDataSequential::getRawDataValues() in the base class for both types. Update BuildConstantData{Array,Vector} tests to test the getRaw API. Also removes its unused Module. In passing, update some comments to include the support for half and bfloat. Update tests to include testing for bfloat. Differential Revision: https://reviews.llvm.org/D98302	2021-03-16 11:57:53 -07:00
Aaron Puchert	1cb15b10ea	Correct Doxygen syntax for inline code There is no syntax like {@code ...} in Doxygen, @code is a block command that ends with @endcode, and generally these are not enclosed in braces. The correct syntax for inline code snippets is @c <code>. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D98665	2021-03-16 15:17:45 +01:00
serge-sans-paille	b2e78a061c	[NFC] Use SmallString instead of std::string for the AttrBuilder This avoids a few unnecessary conversion from StringRef to std::string, and a bunch of extra allocation thanks to the SmallString. Differential Revision: https://reviews.llvm.org/D98190	2021-03-16 13:34:14 +01:00
Caroline Concatto	3c03635d53	[SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse This patch adds support for reverse loop vectorization. It is possible to vectorize the following loop: ``` for (int i = n-1; i >= 0; --i) a[i] = b[i] + 1.0; ``` with fixed or scalable vector. The loop-vectorizer will use 'reverse' on the loads/stores to make sure the lanes themselves are also handled in the right order. This patch adds support for scalable vector on IRBuilder interface to create a reverse vector. The IR function CreateVectorReverse lowers to experimental.vector.reverse for scalable vector and keedp the original behavior for fixed vector using shuffle reverse. Differential Revision: https://reviews.llvm.org/D95363	2021-03-16 07:51:59 +00:00
Bing1 Yu	4f198b0c27	[X86] Pass to transform amx intrinsics to scalar operation. This pass runs in any situations but we skip it when it is not O0 and the function doesn't have optnone attribute. With -O0, the def of shape to amx intrinsics is near the amx intrinsics code. We are not able to find a point which post-dominate all the shape and dominate all amx intrinsics. To decouple the dependency of the shape, we transform amx intrinsics to scalar operation, so that compiling doesn't fail. In long term, we should improve fast register allocation to allocate amx register. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D93594	2021-03-16 10:40:22 +08:00
Lang Hames	ecf6466f01	[JITLink][MachO][x86-64] Introduce generic x86-64 support. This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64 backend to use these edge kinds. This simplifies the implementation of the MachO/x86-64 backend and makes it possible to write generic x86-64 passes and utilities. The new edge kinds are different from the original set used in the MachO/x86-64 backend. Several edge kinds that were not meaningfully distinguished in that backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in the new scheme (these edge kinds can be reintroduced later if we find a use for them). At the same time, new edge kinds have been introduced to convey extra information about the state of the graph. E.g. The RequestAndTransformTo* edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP entries, and the 'Relaxable' suffix distinguishes edges that are candidates for optimization from edges which should be left as-is (e.g. to enable runtime redirection). ELF/x86-64 will be refactored to use these generic edges at some point in the future, and I anticipate a similar refactor to create a generic arm64 support header too. Differential Revision: https://reviews.llvm.org/D98305	2021-03-15 15:43:07 -07:00
Nick Lewycky	483a253ae9	NFC: Formatting changes. Run clang-format over these files. Capitalize some variable names per clang-tidy's request. Pulled out to simplify review of D98302.	2021-03-15 14:26:39 -07:00
Florian Hahn	bb244ea2a8	[AnnotationRemarks] Remove unneeded Function.h include (NFC).	2021-03-15 21:09:35 +00:00
Wenlei He	a5d30421a6	[CSSPGO] Load context profile for external functions in PreLink and populate ThinLTO import list For ThinLTO's prelink compilation, we need to put external inline candidates into an import list attached to function's entry count metadata. This enables ThinLink to treat such cross module callee as hot in summary index, and later helps postlink to import them for profile guided cross module inlining. For AutoFDO, the import list is retrieved by traversing the nested inlinee functions. For CSSPGO, since profile is flatterned, a few things need to happen for it to work: - When loading input profile in extended binary format, we need to load all child context profile whose parent is in current module, so context trie for current module includes potential cross module inlinee. - In order to make the above happen, we need to know whether input profile is CSSPGO profile before start reading function profile, hence a flag for profile summary section is added. - When searching for cross module inline candidate, we need to walk through the context trie instead of nested inlinee profile (callsite sample of AutoFDO profile). - Now that we have more accurate counts with CSSPGO, we swtiched to use entry count instead of total count to decided if an external callee is potentially beneficial to inline. This make it consistent with how we determine whether call tagert is potential inline candidate. Differential Revision: https://reviews.llvm.org/D98590	2021-03-15 12:22:15 -07:00
Fangrui Song	5d44c92bf8	Change void getNoop(MCInst &NopInst) to MCInst getNop() Prefer (self-documenting) return values to output parameters (which are liable to be used). While here, rename Noop to Nop which is more widely used and improves consistency with hasEmitNops/setEmitNops/emitNop/etc.	2021-03-15 12:05:34 -07:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Zahira Ammarguellat	80ca4fd154	[NFC] Fix "unused parameter" error revealed in the Linux self-build.	2021-03-15 12:17:11 -04:00
Jan Kratochvil	a28facba1c	[llvm] [dwarf] Fix DWARFListTableHeader::getOffsetEntry off-by-one D98289 was erroneously reporting `invalid range list offset 0x20110` instead of `invalid range list table index 0`. Differential Revision: https://reviews.llvm.org/D98589	2021-03-14 21:42:44 +01:00
Matt Arsenault	d57d8f364f	CodeGen: Reorder MachinePointerInfo fields This saves a little bit of padding.	2021-03-14 10:06:39 -04:00
kuterd	44c1425c17	[Attributor][fix] Remove problematic EXPENSIVE_CHECK Remove the check that is causing compilation issues in some build configurations.	2021-03-13 18:03:24 +03:00
Lang Hames	4e30b20bdb	[JITLink][ORC] Make the LinkGraph available to modifyPassConfig. This makes the target triple, graph name, and full graph content available when making decisions about how to populate the linker pass pipeline. Also updates the LLJITWithObjectLinkingLayerPlugin example to show more API use, including use of the API changes in this patch.	2021-03-12 18:42:51 -08:00
Hubert Tong	4f9cc1512d	Revert "[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax" This reverts commit `bcdd40f802`. See https://reviews.llvm.org/D98543.	2021-03-12 14:48:00 -05:00
Zahira Ammarguellat	c2006f857d	[NFC] Fix "unused parameter" error revealed in the Linux self-build.	2021-03-12 10:26:40 -08:00
Anirudh Prasad	bcdd40f802	[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax - This patch adds in support for the ordinary HLASM comment syntax asm statements (Reference - Chapter 7, Comment Statements, Ordinary Comment Statements) - In brief, the ordinary comment syntax if used, must begin with the "" character - To achieve this, this patch makes use of the CommentString attribute provided in the base MCAsmInfo class - In the SystemZMCAsmInfo class, the CommentString attribute was set to "" based on the assembler dialect - Furthermore, a new attribute RestrictCommentString, is provided to only treat a string as a comment if it appears at the start of the asm statement. Example: "jo -4" is valid in HLASM (jump back 4 bytes from current point - similar to jo -4 in gnu asm) and we don't want "-4" to be treated as a comment. - RFC for HLASM Parser support implementation: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html Reviewed By: scott.linder, Kai Differential Revision: https://reviews.llvm.org/D97703	2021-03-12 11:56:11 -05:00
Matt Arsenault	6b76d82853	GlobalISel: Fix marking byval arguments as immutable byval arguments need to be assumed writable. Only implicitly stack passed arguments which aren't addressable in the IR can be assumed immutable. Mips is still broken since for some reason its doing its own thing with the ValueHandlers (and x86 doesn't actually handle byval arguments now, although some of the code is there).	2021-03-12 09:01:53 -05:00
serge-sans-paille	bc4a5bdce4	[NFC] Use StringRef instead of const char* for AsmPrinter This avoids calling strlen to repeatedly compute some string size.	2021-03-12 14:39:25 +01:00
Hans Wennborg	f50aef745c	Revert "[InstrProfiling] Don't generate __llvm_profile_runtime_user" This broke the check-profile tests on Mac, see comment on the code review. > This is no longer needed, we can add __llvm_profile_runtime directly > to llvm.compiler.used or llvm.used to achieve the same effect. > > Differential Revision: https://reviews.llvm.org/D98325 This reverts commit `c7712087cb`. Also reverting the dependent follow-up commit: Revert "[InstrProfiling] Generate runtime hook for ELF platforms" > When using -fprofile-list to selectively apply instrumentation only > to certain files or functions, we may end up with a binary that doesn't > have any counters in the case where no files were selected. However, > because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the > runtime would still be pulled in and incur some non-trivial overhead, > especially in the case when the continuous or runtime counter relocation > mode is being used. A better way would be to pull in the profile runtime > only when needed by declaring the __llvm_profile_runtime symbol in the > translation unit only when needed. > > This approach was already used prior to `9a041a7522`, but we changed it > to always generate the __llvm_profile_runtime due to a TAPI limitation. > Since TAPI is only used on Mach-O platforms, we could use the early > emission of __llvm_profile_runtime there, and on other platforms we > could change back to the earlier approach where the symbol is generated > later only when needed. We can stop passing -u__llvm_profile_runtime to > the linker on Linux and Fuchsia since the generated undefined symbol in > each translation unit that needed it serves the same purpose. > > Differential Revision: https://reviews.llvm.org/D98061 This reverts commit `87fd09b25f`.	2021-03-12 13:53:46 +01:00
Serguei Katkov	cfe8f8e0f0	Revert "Mark gc.relocate and gc.result as readnone" As readnone function they become movable and LICM can hoist them out of a loop. As a result in LCSSA form phi node of type token is created. No one is ready that GCRelocate first operand is phi node but expects to be token. GVN test were also updated, it seems it does not do what is expected. Test for LICM is also added. This reverts commit `f352463ade`.	2021-03-12 16:59:17 +07:00
Johannes Doerfert	0fe0d114e4	Revert "[OpenMP] Introduce the `disable_selector_propagation` variant selector trait" Need to revert `ad9e98b8ef` which this commit depends on. This reverts commit f771ef7b5f0ed260d00931cd50e6fe462edbacaf.	2021-03-11 23:48:35 -06:00
Johannes Doerfert	b2642456ab	[OpenMP] Introduce the `disable_selector_propagation` variant selector trait Nested `omp [begin\|end] declare variant` inherit the selectors from surrounding `omp (begin\|end) declare variant` constructs. To stop such propagation the user can add the `disable_selector_propagation` to the `extension` set in the `implementation` selector. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D95765	2021-03-11 23:31:25 -06:00
Carl Ritson	c07f2025e4	[AMDGPU] Restrict image_msaa_load to MSAA dimension types This instruction is only valid on 2D MSAA and 2D MSAA Array surfaces. Remove intrinsic support for other dimension types, and block assembly for unsupported dimensions. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D98397	2021-03-12 09:47:24 +09:00
Florian Hahn	c92ec0dd92	[Matrix] Add support for matrix-by-scalar division. This patch extends the matrix spec to allow matrix-by-scalar division. Originally support for `/` was left out to avoid ambiguity for the matrix-matrix version of `/`, which could either be elementwise or specified as matrix multiplication M1 * (1/M2). For the matrix-scalar version, no ambiguity exists; `*` is also an elementwise operation in that case. Matrix-by-scalar division is commonly supported by systems including Matlab, Mathematica or NumPy. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97857	2021-03-11 22:21:23 +00:00
Wenlei He	051f2c144e	[SamplePGO] Skip inlinee profile scaling for sample loader inlining For CGSCC inline, we need to scale down a function's branch weights and entry counts when thee it's inlined at a callsite. This is done through updateCallProfile. Additionally, we also scale the weigths for the inlined clone based on call site count in updateCallerBFI. Neither is needed for inlining during sample profile loader as it's using context profile that is separated from inlinee's own profile. This change skip the inlinee profile scaling for sample loader inlining. Differential Revision: https://reviews.llvm.org/D98187	2021-03-11 10:18:26 -08:00
David Green	fad70c3068	[ARM] Improve WLS lowering Recently we improved the lowering of low overhead loops and tail predicated loops, but concentrated first on the DLS do style loops. This extends those improvements over to the WLS while loops, improving the chance of lowering them successfully. To do this the lowering has to change a little as the instructions are terminators that produce a value - something that needs to be treated carefully. Lowering starts at the Hardware Loop pass, inserting a new llvm.test.start.loop.iterations that produces both an i1 to control the loop entry and an i32 similar to the llvm.start.loop.iterations intrinsic added for do loops. This feeds into the loop phi, properly gluing the values together: %wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div) %wls0 = extractvalue { i32, i1 } %wls, 0 %wls1 = extractvalue { i32, i1 } %wls, 1 br i1 %wls1, label %loop.ph, label %loop.exit ... loop: %lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ] .. %iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1) %cmp = icmp ne i32 %iv.next, 0 br i1 %cmp, label %loop, label %loop.exit The llvm.test.start.loop.iterations need to be lowered through ISel lowering as a pair of WLS and WLSSETUP nodes, which each get converted to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent t2WhileLoopStart from being a terminator that produces a value, something difficult to control at that stage in the pipeline. Instead the t2WhileLoopSetup produces the value of LR (essentially acting as a lr = subs rn, 0), t2WhileLoopStart consumes that lr value (the Bcc). These are then converted into a single t2WhileLoopStartLR at the same point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop to prevent them from progressing further in the pipeline. The t2WhileLoopStartLR is a single instruction that takes a GPR and produces LR, similar to the WLS instruction. %1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3 t2B %bb.1 ... bb.2.loop: %2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2 ... %3:gprlr = t2LoopEndDec %2:gprlr, %bb.2 t2B %bb.3 The t2WhileLoopStartLR can then be treated similar to the other low overhead loop pseudos, eventually being lowered to a WLS providing the branches are within range. Differential Revision: https://reviews.llvm.org/D97729	2021-03-11 17:56:19 +00:00
Hiroshi Yamauchi	365b225d46	[PGO] Fix two issues in PGOMemOPSizeOpt. 1. PGOMemOPSizeOpt grabs only the first, up to five (by default) entries from the value profile metadata and preserves the remaining entries for the fallback memop call site. If there are more than five entries, the rest of the entries would get dropped. This is fine for PGOMemOPSizeOpt itself as it only promotes up to 3 (by default) values, but potentially not for other downstream passes that may use the value profile metadata. 2. PGOMemOPSizeOpt originally assumed that only values 0 through 8 are kept track of. When the range buckets were introduced, it was changed to skip the range buckets, but since it does not grab all entries (only five), if some range buckets exist in the first five entries, it could potentially cause fewer promotion opportunities (eg. if 4 out of 5 were range buckets, it may be able to promote up to one non-range bucket, as opposed to 3.) Also, combined with 1, it means that wrong entries may be preserved, as it didn't correctly keep track of which were entries were skipped. To fix this, PGOMemOPSizeOpt now grabs all the entries (up to the maximum number of value profile buckets), keeps track of which entries were skipped, and preserves all the remaining entries. Differential Revision: https://reviews.llvm.org/D97592	2021-03-11 09:53:05 -08:00
Nikita Popov	6312c53870	[IRBuilder] Deprecate CreateLoad APIs with implicit type These APIs are not compatible with opaque pointers. Deprecate them to avoid the introduction of further uses.	2021-03-11 18:46:45 +01:00
Craig Topper	e9426dfbae	[ValueTypes][RISCV] Add MVT for v1f16. RISCV makes all fixed vector MVTs with size less than or equal to a command line option legal. This didn't include v1f16 because it was missing but did include v1f32 and v1f64. One test is affected where we did test this type, but it is a horizontal reduction so it is non-sensical. Perhaps we should canonicalize that away somewhere. I'm not sure if we should be making v1 types legal, but this will at least make RISCV consistent across all types. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98365	2021-03-11 09:23:18 -08:00
Joseph Huber	807466ef28	[OpenMP] Restore backwards compatibility for libomptarget Summary: The changes introduced in D87946 changed the API for libomptarget functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x but was not given a backward-compatible interface. This change will require people using Clang 13.x or 12.x to recompile their offloading programs. Reviewed By: jdoerfert cchen Differential Revision: https://reviews.llvm.org/D98358	2021-03-11 09:52:11 -05:00
Stephen Tozer	f40976bd01	Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reverts commit `c0f3dfb9f1`. Reverted due to an error on the clang-x64-windows-msvc buildbot.	2021-03-11 14:48:01 +00:00
Simon Pilgrim	cc48b45d24	[llvm-mca] Fix uninitialized variable in InOrderIssueStage constructor warning. NFCI.	2021-03-11 14:41:20 +00:00
Simon Pilgrim	9a259f4386	[Transforms] SampleProfileLoaderBaseImpl<BT>::getFunctionLoc - fix Wdocumentation warnings. NFCI.	2021-03-11 14:04:08 +00:00
gbtozers	c0f3dfb9f1	[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic operations with two or more non-const operands; this includes the GetElementPtr instruction, and most Binary Operator instructions. These salvages produce DIArgList locations and are only valid for dbg.values, as currently variadic DIExpressions must use DW_OP_stack_value. This functionality is also only added for salvageDebugInfoForDbgValues; other functions that directly call salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be updated in a later patch. Differential Revision: https://reviews.llvm.org/D91722	2021-03-11 13:33:49 +00:00
Serguei Katkov	0480927712	[Statepoint Lowering] Handle the case with several gc.result Recently gc.result has been marked with readnone instead of readonly and this opens a door for different optimization to duplicate gc.result. Statepoint lowering is not ready to see several gc.results. The problem appears when there are gc.results with one located in the same basic block and another located in other basic block. In this case we need both export VR and fill local setValue. Note that this case is not sufficient optimization done before CodeGen. It is evident that local gc.result dominates all other gc.results and it is handled by GVN and EarlyCSE. But anyway, even if IR is not optimal Backend should not crash on a valid IR. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98393	2021-03-11 18:44:44 +07:00
Simon Pilgrim	e74d626925	[IPO] Fix EXPENSIVE_CHECKS assert added at D83744. NFCI. It wasn't taking into account that QueryingAA was a pointer.	2021-03-11 10:29:15 +00:00
Nikita Popov	403da6a69a	Reapply [LICM] Make promotion faster Relative to the previous implementation, this always uses aliasesUnknownInst() instead of aliasesPointer() to correctly handle atomics. The added test case was previously miscompiled. ----- Even when MemorySSA-based LICM is used, an AST is still populated for scalar promotion. As the AST has quadratic complexity, a lot of time is spent in this step despite the existing access count limit. This patch optimizes the identification of promotable stores. The idea here is pretty simple: We're only interested in must-alias mod sets of loop invariant pointers. As such, only populate the AST with loop-invariant loads and stores (anything else is definitely not promotable) and then discard any sets which alias with any of the remaining, definitely non-promotable accesses. If we promoted something, check whether this has made some other accesses loop invariant and thus possible promotion candidates. This is much faster in practice, because we need to perform AA queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable) instead of O(NumTotal^2), and NumPromotable tends to be small. Additionally, promotable accesses have loop invariant pointers, for which AA is cheaper. This has a signicant positive compile-time impact. We save ~1.8% geomean on CTMark at O3, with 6% on lencod in particular and 25% on individual files. Conceptually, this change is NFC, but may not be so in practice, because the AST is only an approximation, and can produce different results depending on the order in which accesses are added. However, there is at least no impact on the number of promotions (licm.NumPromoted) in test-suite O3 configuration with this change. Differential Revision: https://reviews.llvm.org/D89264	2021-03-11 10:50:28 +01:00
Djordje Todorovic	9f41c03f82	[Debugify][OriginalDIMode] Export the report into JSON file By using the original-di check with debugify in the combination with the llvm/utils/llvm-original-di-preservation.py it becomes very user friendly tool. An example of the HTML page with the issues related to debug info can be found at [0]. [0] https://djolertrk.github.io/di-checker-html-report-example/ Differential Revision: https://reviews.llvm.org/D82546	2021-03-11 01:11:13 -08:00
Petr Hosek	c7712087cb	[InstrProfiling] Don't generate __llvm_profile_runtime_user This is no longer needed, we can add __llvm_profile_runtime directly to llvm.compiler.used or llvm.used to achieve the same effect. Differential Revision: https://reviews.llvm.org/D98325	2021-03-10 22:33:51 -08:00
Reid Kleckner	b69db4a7ab	Re-land "[PDB] Defer relocating .debug$S until commit time and parallelize it" This reverts commit `bacf9cf2c5` and reinstates commit `1a9bd5b813`. Reverting this commit did not appear to make the problem go away, so we can go ahead and reland it.	2021-03-10 15:14:09 -08:00
kuterd	d75c9e61a5	[Attributor] Attributor call site specific AAValueConstantRange This patch makes uses of the context bridges introduced in D83299 to make AAValueConstantRange call site specific. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D83744	2021-03-11 01:19:44 +03:00
Quentin Colombet	49942c6d4a	[NFC] Fix a compiler warning Fix a warning caused by -Wrange-loop-analysis Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98297	2021-03-10 13:28:53 -08:00
Alexey Lapshin	4f16e177e1	[llvm-objcopy][NFC] replace class Buffer/MemBuffer/FileBuffer with streams. During D88827 it was requested to remove the local implementation of Memory/File Buffers: // TODO: refactor the buffer classes in LLVM to enable us to use them here // directly. This patch uses raw_ostream instead of Buffers. Generally, using streams could allow us to reduce memory usages. No need to load all data into the memory - the data could be streamed through a smaller buffer. Thus, this patch uses raw_ostream as an interface for output data: Error executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out); Note 1. This patch does not change the implementation of Writers so that data would be directly stored into raw_ostream. This is assumed to be done later. Note 2. It would be better if Writers would be implemented in a such way that data could be streamed without seeking/updating. If that would be inconvenient then raw_ostream could be replaced with raw_pwrite_stream to have a possibility to seek back and update file headers. This is assumed to be done later if necessary. Note 3. Current FileOutputBuffer allows using a memory-mapped file. The raw_fd_ostream (which could be used if data should be stored in the file) does not allow us to use a memory-mapped file. Memory map functionality could be implemented for raw_fd_ostream: It is possible to add resize() method into raw_ostream. class raw_ostream { void resize(uint64_t size); } That method, implemented for raw_fd_ostream, could create a memory-mapped file. The streamed data would be written into that memory file then. Thus we would be able to use memory-mapped files with raw_fd_ostream. This is assumed to be done later if necessary. Differential Revision: https://reviews.llvm.org/D91028	2021-03-10 23:50:04 +03:00
Sriraman Tallam	0ba1ebcbb7	Remove original implementation of UniqueInternalLinkageNames pass. D96109 was recently submitted which contains the refactored implementation of -funique-internal-linakge-names by adding the unique suffixes in clang rather than as an LLVM pass. Deleting the former implementation in this change. Differential Revision: https://reviews.llvm.org/D98234	2021-03-10 11:57:40 -08:00
Quentin Colombet	66dab2fa84	[NFC] Fix compiler warnings Fix warnings caused by -Wrange-loop-analysis. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98298	2021-03-10 11:03:50 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've remove the special case I previously put in for ABS for D97991 as the default expansion is now able to succesfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Craig Topper	0c73a506e8	[RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32. Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen just that it is functional. This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x intrinsic and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a really unoptimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895	2021-03-10 09:45:38 -08:00
Ta-Wei Tu	7ff2768be1	Revert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`" This reverts commit `df9158c9a4`.	2021-03-11 01:24:43 +08:00
Stephen Tozer	1db137b185	[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578	2021-03-10 17:15:24 +00:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
Christudasan Devadasan	4c6ab48fb1	GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM It is good to have a combined `divrem` instruction when the `div` and `rem` are computed from identical input operands. Some targets can lower them through a single expansion that computes both division and remainder. It effectively reduces the number of instructions than individually expanding them. Reviewed By: arsenm, paquette Differential Revision: https://reviews.llvm.org/D96013	2021-03-10 18:46:07 +05:30
Alex Richardson	35bf23e965	Avoid shuffle self-assignment in EXPENSIVE_CHECKS builds Some versions of libstdc++ perform self-assignment in std::shuffle. This breaks the EXPENSIVE_CHECKS builds of TableGen due to an incorrect assertion in libstdc++. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85828. Fixes https://llvm.org/PR37652 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D98167	2021-03-10 11:17:34 +00:00
Vladislav Vinogradov	dc8446c2a0	[ADT][NFC] Use `size_t` type for index in `indexed_accessor_range` It makes it consistent with `size()` method return type and with STL-like containers API. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D97921	2021-03-10 11:40:59 +03:00
Wei Mi	ee35784a90	[SampleFDO] Support enabling -funique-internal-linkage-name. now -funique-internal-linkage-name flag is available, and we want to flip it on by default since it is beneficial to have separate sample profiles for different internal symbols with the same name. As a preparation, we want to avoid regression caused by the flip. When we flip -funique-internal-linkage-name on, the profile is collected from binary built without -funique-internal-linkage-name so it has no uniq suffix, but the IR in the optimized build contains the suffix. This kind of mismatch may introduce transient regression. To avoid such mismatch, we introduce a NameTable section flag indicating whether there is any name in the profile containing uniq suffix. Compiler will decide whether to keep uniq suffix during name canonicalization depending on the NameTable section flag. The flag is only available for extbinary format. For other formats, by default compiler will keep uniq suffix so they will only experience transient regression when -funique-internal-linkage-name is just flipped. Another type of regression is caused by places where we miss to call getCanonicalFnName. Those places are fixed. Differential Revision: https://reviews.llvm.org/D96932	2021-03-09 21:41:40 -08:00
Amara Emerson	55e760769b	[GlobalISel] Fold away G_BUILD_VECTOR with all elements extracted. If every element is extracted from a G_BUILD_VECTOR, pass through the source registers. This is different to the extract(build_vector) combine because this one tolerates multiple users as long as they're exhaustive. Differential Revision: https://reviews.llvm.org/D97890	2021-03-09 11:34:26 -08:00
Amara Emerson	e60ab72137	[AArch64][GlobalISel] Add combine for extract_vector_elt(build_vector, cst) Differential Revision: https://reviews.llvm.org/D97835	2021-03-09 11:08:02 -08:00
Fangrui Song	42e3f97a9d	[MC] Change ELFOSABI_NONE to ELFOSABI_GNU for SHF_GNU_RETAIN GNU ld does not give SHF_GNU_RETAIN GC root semantics for ELFOSABI_NONE. (https://sourceware.org/pipermail/binutils/2021-March/115581.html) This allows GNU ld to interpret SHF_GNU_RETAIN and avoids a gold quirk https://sourceware.org/bugzilla/show_bug.cgi?id=27490 Because ELFObjectWriter is in an anonymous namespace, I have to place `markGnuAbi` in the parent MCObjectWriter. Differential Revision: https://reviews.llvm.org/D97976	2021-03-09 09:59:47 -08:00
Nikita Popov	55ae279ba7	[FastISel] Don't trivially kill extractvalues (PR49467) All extractvalues of the same value at the same index will map to the same register, so even if one specific extractvalue only has one use, we should not mark it as a trivial kill, as there may be more extractvalues later. Fixes https://bugs.llvm.org/show_bug.cgi?id=49467. Differential Revision: https://reviews.llvm.org/D98145	2021-03-09 18:46:38 +01:00
gbtozers	f0513413c7	[DebugInfo] Add replaceArg function to simplify DBG_VALUE_LIST expressions The LiveDebugValues and LiveDebugVariables implementations for handling DBG_VALUE_LIST instructions can be simplified significantly if they do not have to deal with any duplicated operands, such as a DBG_VALUE_LIST that uses the same register multiple times in its expression. This patch adds a function, replaceArg, that can be used to simplify a DIExpression in the case of duplicated operands. Differential Revision: https://reviews.llvm.org/D83896	2021-03-09 17:41:04 +00:00
Dave Lee	736afe465f	Revert "[build][modules] Fix ObjCARCUtil.h modularization" This reverts commit `f1b690598e`.	2021-03-09 09:36:47 -08:00
gbtozers	df69c69427	[DebugInfo] Handle multiple variable location operands in IR This patch updates the various IR passes to correctly handle dbg.values with a DIArgList location. This patch does not actually allow DIArgLists to be produced by salvageDebugInfo, and it does not affect any pass after codegen-prepare. Other than that, it should cover every IR pass. Most of the changes simply extend code that operated on a single debug value to operate on the list of debug values in the style of any_of, all_of, for_each, etc. Instances of setOperand(0, ...) have been replaced with with replaceVariableLocationOp, which takes the value that is being replaced as an additional argument. In places where this value isn't readily available, we have to track the old value through to the point where it gets replaced. Differential Revision: https://reviews.llvm.org/D88232	2021-03-09 16:44:38 +00:00
Stefan Gränitz	265bc5af7b	[Orc] Always check mapped sections for ELFDebugObject are in bounds of working memory buffer As stated in the JITLink user guide: Do not assume that the input object is well formed. https://llvm.org/docs/JITLink.html#tips-for-jitlink-backend-developers	2021-03-09 14:01:50 +01:00
Cullen Rhodes	2750f3ed31	[IR] Introduce llvm.experimental.vector.splice intrinsic This patch introduces a new intrinsic @llvm.experimental.vector.splice that constructs a vector of the same type as the two input vectors, based on a immediate where the sign of the immediate distinguishes two variants. A positive immediate specifies an index into the first vector and a negative immediate specifies the number of trailing elements to extract from the first vector. For example: @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> ; index @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count These intrinsics support both fixed and scalable vectors, where the former is lowered to a shufflevector to maintain existing behaviour, although while marked as experimental the recommended way to express this operation for fixed-width vectors is to use shufflevector. For scalable vectors where it is not possible to express a shufflevector mask for this operation, a new ISD node has been implemented. This is one of the named shufflevector intrinsics proposed on the mailing-list in the RFC at [1]. Patch by Paul Walker and Cullen Rhodes. [1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D94708	2021-03-09 10:44:22 +00:00
gbtozers	93b170ea24	[DebugInfo] Handle dbg.values with multiple variable location operands in ISel This patch adds partial support in Instruction Selection for dbg.values that use a DIArgList. This patch does not add support for producing DBG_VALUE_LIST, but adds the logic for processing DIArgLists within the ISel pass. This change is largely focused on handleDebugValue and some of the functions that it calls. Outside of this, salvageDebugInfo and transferDbgValues have been modified to replace individual operands instead of the entire value; dangling debug info for variadic debug values is not currently supported (but may be added later). Differential Revision: https://reviews.llvm.org/D88589	2021-03-09 09:48:03 +00:00
Jan Kratochvil	4289a7f1d7	llvm-dwarfdump: Fix DWARF-5 DW_FORM_implicit_const (used by GCC) Differential Revision: https://reviews.llvm.org/D98195	2021-03-09 09:26:58 +01:00
Akira Hatanaka	dca5737945	Move ObjCARCUtil.h back to llvm/Analysis Instead of adding the header to llvm/IR, just duplicate the marker string in the auto upgrader.	2021-03-08 16:35:24 -08:00
Rahman Lavaee	c245c21c43	[llvm-readelf] Support dumping the BB address map section with --bb-addr-map. This patch lets llvm-readelf dump the content of the BB address map section in the following format: ``` Function { At: <address> BB entries [ { Offset: <offset> Size: <size> Metadata: <metadata> }, ... ] } ... ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D95511	2021-03-08 16:20:11 -08:00
Dave Lee	f1b690598e	[build][modules] Fix ObjCARCUtil.h modularization This change was introduced by this fix: `a2a55def35`. This makes a submodule for ObjCARCUtil.h, because leaving it in LLVM_intrinsic_gen causes a dependency cycle between LLVM_IR and LLVM_intrinsic_gen.	2021-03-08 15:51:27 -08:00
Benjamin Kramer	0d96ea0792	[ValueTracking] Move matchSimpleRecurrence out of line The header only has a forward declaration of PHINode available, and this function doesn't seem to get much out of inlining.	2021-03-09 00:04:47 +01:00
Sanjay Patel	34d0d644ff	[ValueTracking] move/add helper to get inverse min/max; NFC We will need to this functionality to improve min/max folds in instcombine when we canonicalize to intrinsics.	2021-03-08 17:38:22 -05:00
wlei	c460ef61d6	[CSSPGO][llvm-profgen] Change sample count of dangling probe in llvm-profgen Differential Revision: https://reviews.llvm.org/D96811	2021-03-08 14:36:02 -08:00
Masoud Ataei	820f508b08	[PowerPC] Removing _massv place holder Since P8 is the oldest machine supported by MASSV pass, _massv place holder is removed and the oldest version of MASSV functions is assumed. If the P9 vector specific is detected in the compilation process, the P8 prefix will be updated to P9. Differential Revision: https://reviews.llvm.org/D98064	2021-03-08 21:43:24 +00:00
Jessica Paquette	5c26be214d	[AArch64][GlobalISel] Lower G_BUILD_VECTOR -> G_DUP If we have ``` %vec = G_BUILD_VECTOR %reg, %reg, ..., %reg ``` Then lower it to ``` %vec = G_DUP %reg ``` Also update the selector to handle constant splats on G_DUP. This will not combine when the splat is all zeros or ones. Tablegen-imported patterns rely on these being G_BUILD_VECTOR. Minor code size improvements on CTMark at -Os. Also adds some utility functions to make it a bit easier to recognize splats, and an AArch64-specific splat helper. Differential Revision: https://reviews.llvm.org/D97731	2021-03-08 13:01:10 -08:00
Alina Sbirlea	29482426b5	Revert "[LICM] Make promotion faster" Revert `3d8f842712` Revision triggers a miscompile sinking a store incorrectly outside a threading loop. Detected by tsan. Reverting while investigating. Differential Revision: https://reviews.llvm.org/D89264	2021-03-08 12:53:03 -08:00
Min-Yih Hsu	8dddc15297	[M68k](4/8) MC layer and object file support - Add the M68k-specific MC layer implementation - Add ELF support for M68k - Add M68k-specifc CC and reloc TODO: Currently AsmParser and disassembler are not implemented yet. Please use this bug to track the status: https://bugs.llvm.org/show_bug.cgi?id=48976 Authors: myhsu, m4yers, glaubitz Differential Revision: https://reviews.llvm.org/D88390	2021-03-08 12:30:57 -08:00
Min-Yih Hsu	bec7b16692	[M68k](3/8) Skeleton and target description files - Infrastructure for the target (i.e. build files, target triple etc.) - All of the target description TableGen file Authors: myhsu, m4yers, glaubitz Differential Revision: https://reviews.llvm.org/D88389	2021-03-08 12:30:57 -08:00
Min-Yih Hsu	6dcc325ce0	[M68k][MIR](2/8) Changes in the target-independent MIR part - Add new callback in `TargetInstrInfo` -- `isPCRelRegisterOperandLegal` -- to query whether pc-rel register MachineOperand is legal. - Add new function to search DebugLoc in a reverse ordering Authors: myhsu, m4yers, glaubitz Differential Revision: https://reviews.llvm.org/D88386	2021-03-08 12:30:57 -08:00
Min-Yih Hsu	503343191e	[M68k][TableGen](1/8) TableGen related changes - Add a new TableGen backend: CodeBeads - Add support to generate logical operand information For the first item, it is currently a workaround of M68k's (complex) instruction encoding. A typical architecture, especially CISC one like X86, normally uses `MCInstrDesc::TSFlags` to carry instruction encoding info. However, at the early days of M68k backend development, we found it difficult to fit every possible encoding into the 64-bit `MCInstrDesc::TSFlags`. Therefore CodeBeads was invented to provide an alternative, arbitrary length container for instruciton encoding info. However, in the long term we incline not to use a new TG backend for less common pattern like what we encountered in M68k. A bug has been created to host to discussion on migrating from CodeBeads to more concise solution: https://bugs.llvm.org/show_bug.cgi?id=48792 The second item was also served for similar purpose. It created utility functions that tell you the index of a `MachineOperand` in a `MachineInst` given a logical operand index. In normal cases a logical operand is the same as `MachineOperand`, but for operands using complex addressing mode a logical operand might be consisting of multiple `MachineOperand`. The TableGen-ed `getLogicalOperandIdx`, for instance, can give you the mapping between these two concepts. Nevertheless, we hope to remove this feature in the future if possible. Since it's not really useful for the targets supported by LLVM now either. Authors: myhsu, m4yers, glaubitz Differential Revision: https://reviews.llvm.org/D88385	2021-03-08 12:30:56 -08:00
Victor Huang	621023b218	[AIX][TLS] Add assert check of valid csect type for the storage mapping class XCOFF::XMC_UL This patch adds the assert check inside the constructor for the csect (MCSectionXCOFF) to ensure valid csect type used for the storage mappping class XCOFF:XMC_UL.	2021-03-08 14:18:57 -06:00
Yuta Saito	aa0c571a5f	[WebAssembly] Add new relocation for location relative data This `R_WASM_MEMORY_ADDR_SELFREL_I32` relocation represents an offset between its relocating address and the symbol address. It's very similar to `R_X86_64_PC32` but restricted to be used for only data segments. ``` S + A - P ``` A: Represents the addend used to compute the value of the relocatable field. P: Represents the place of the storage unit being relocated. S: Represents the value of the symbol whose index resides in the relocation entry. Proposal: https://github.com/WebAssembly/tool-conventions/issues/162 Differential Revision: https://reviews.llvm.org/D96659	2021-03-08 11:34:10 -08:00
Philip Reames	d9a29a6752	constify getUnderlyingObject implementation [nfc]	2021-03-08 11:32:54 -08:00
Philip Reames	239a618180	[instcombine] Collapse trivial and recurrences If we have a recurrence of the form <Start, And, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence. Differential Revision: https://reviews.llvm.org/D97578 (and part)	2021-03-08 09:21:38 -08:00
Philip Reames	97a7bc5831	[gvn] Precisely propagate equalities to phi operands The code used for propagating equalities (e.g. assume facts) was conservative in two ways - one of which this patch fixes. Specifically, it shifts the code reasoning about whether a use is dominated by the end of the assume block to consider phi uses to exist on the predecessor edge. This matches the dominator tree handling for dominates(Edge, Use), and simply extends it to dominates(BB, Use). Note that the decision to use the end of the block is itself a conservative choice. The more precise option would be to use the later of the assume and the value, and replace all uses after that. GVN handles that case separately (with the replace operand mechanism) because it used to be expensive to ask dominator questions within blocks. With the new instruction ordering support, we should probably rewrite this code at some point to simplify. Differential Revision: https://reviews.llvm.org/D98082	2021-03-08 08:59:00 -08:00
gbtozers	e5d958c456	[DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch updates DbgVariableIntrinsics to support use of a DIArgList for the location operand, resulting in a significant change to its interface. This patch does not update all IR passes to support multiple location operands in a dbg.value; the only change is to update the DbgVariableIntrinsic interface and its uses. All code outside of the intrinsic classes assumes that an intrinsic will always have exactly one location operand; they will still support DIArgLists, but only if they contain exactly one Value. Among other changes, the setOperand and setArgOperand functions in DbgVariableIntrinsic have been made private. This is to prevent code from setting the operands of these intrinsics directly, which could easily result in incorrect/invalid operands being set. This does not prevent these functions from being called on a debug intrinsic at all, as they can still be called on any CallInst pointer; it is assumed that any code directly setting the operands on a generic call instruction is doing so safely. The intention for making these functions private is to prevent DIArgLists from being overwritten by code that's naively trying to replace one of the Values it points to, and also to fail fast if a DbgVariableIntrinsic is updated to use a DIArgList without a valid corresponding DIExpression.	2021-03-08 14:36:13 +00:00
Ta-Wei Tu	df9158c9a4	[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest` The check `tightlyNested()` in `LoopInterchange` is similar to the one in `LoopNest`. In fact, the former misses some cases where loop-interchange is not feasible and results in incorrect behaviour. Replacing it with the much robust version provided by `LoopNest` reduces code duplications and fixes https://bugs.llvm.org/show_bug.cgi?id=48113. `LoopInterchange` has a weaker definition of tightly or perfectly nesting-ness than the one implemented in `LoopNest::arePerfectlyNested()`. Therefore, `tightlyNested()` is instead implemented with `LoopNest::checkLoopsStructure` and additional checks for unsafe instructions. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D97290	2021-03-08 11:36:08 +08:00
Mehdi Amini	fe9a4b55da	Fix build post-revert in `8d5a981a13` One commit introduced after the reverted change was using an API introduced there, this is reintroducing the API, but not the original broken change.	2021-03-08 00:59:05 +00:00
Mehdi Amini	8d5a981a13	Revert "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe" This reverts commit `99108c791d`. Clang is miscompiling LLVM with this change, a stage-2 build hits multiple failures. As a repro, I built clang in a stage1 directory and used it this way: cmake -G Ninja ../llvm \ -DCMAKE_CXX_COMPILER=`pwd`/../build-stage1/bin/clang++ \ -DCMAKE_C_COMPILER=`pwd`/../build-stage1/bin/clang \ -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU" \ -DLLVM_ENABLE_PROJECTS=mlir \ -DLLVM_BUILD_EXAMPLES=ON \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_ENABLE_ASSERTIONS=On ninja check-mlir	2021-03-08 00:15:47 +00:00
Juneyoung Lee	99108c791d	[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe This patch makes FoldBranchToCommonDest merge branch conditions into `select i1` rather than `and/or i1` when it is called by SimplifyCFG. It is known that merging conditions into and/or is poison-unsafe, and this is towards making things more correct by removing possible miscompilations. Currently, InstCombine simply consumes these selects into and/or of i1 (which is also unsafe), so the visible effect would be very small. The unsafe select -> and/or transformation will be removed in the future. There has been efforts for updating optimizations to support the select form as well, and they are linked to D93065. The safe transformation is fired when it is called by SimplifyCFG only. This is done by setting the new `PoisonSafe` argument as true. Another place that calls FoldBranchToCommonDest is LoopSimplify. `PoisonSafe` flag is set to false in this case because enabling it has a nontrivial impact in performance because SCEV is more conservative with select form and InductiveRangeCheckElimination isn't aware of select form of and/or i1. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95026	2021-03-08 01:38:03 +09:00
Fangrui Song	bb6732cf62	[MC] Add parseEOL() overload and migrate some parseToken(AsmToken::EndOfStatement) to parseEOL() For many directives, the following diagnostics * `error: unexpected token` * `error: unexpected token in '.abort' directive"` are replaced with `error: expected newline`. `unexpected token` may make the user think a different token is needed. `expected newline` is clearer about the expected token. For `in '...' directive`, the directive name is not useful because the next line replicates the error line which includes the directive.	2021-03-06 17:45:23 -08:00
Fangrui Song	d96af2ed2d	[MC] Support .symver , , remove As a resolution to https://sourceware.org/bugzilla/show_bug.cgi?id=25295 , GNU as from binutils 2.35 supports the optional third argument for the .symver directive. 'remove' for a non-default version is useful: `.symver def_v1, def@v1, remove` => def_v1 is not retained in the symbol table. Previously the user has to strip the original symbol or specify a `local:` version node in a version script to localize the symbol. `.symver def, def@@v1, remove` and `.symver def, def@@@v1, remove` are supported as well, though they are identical to `.symver def, def@@@v1`. local/hidden are not useful so this patch does not implement them.	2021-03-06 15:23:02 -08:00
Vy Nguyen	f8b01d54c3	Reland `293e8fa13d` [llvm-exegesis] Disable the LBR check on AMD https://bugs.llvm.org/show_bug.cgi?id=48918 The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW. Theory we've got is that the lbr-checking code could be confused on AMD. Differential Revision: https://reviews.llvm.org/D97504 New change: - Surround usages of x86 helper in llvm-exegesis/X86/Target.cpp with ifdef - Fix bug which caused the caller of getVendorSignature to not have a copy of EAX that it expected.	2021-03-05 13:23:42 -05:00
Philip Reames	f352463ade	Mark gc.relocate and gc.result as readnone For some reason, we had been marking gc.relocates as reading memory. There's no known reason for this, and I suspect it to be a legacy of very early implementation conservatism. gc.relocate and gc.result are simply projections of the return values from the associated statepoint. Note that the LangRef has always declared them readnone. The EarlyCSE change is simply moving the special casing from readonly to readnone handling. As noted by the test diffs, this does allow some additional CSE when relocates are separated by stores, but since we generate gc.relocates in batches, this is unlikely to help anything in practice. This was reviewed as part of https://reviews.llvm.org/D97974, but split at reviewer request before landing. The motivation is to enable the GVN changes in that patch.	2021-03-05 10:07:17 -08:00
gbtozers	65600cb2a7	[DebugInfo] Add DIArgList MD to store multple values in DbgVariableIntrinsics This patch adds a new metadata node, DIArgList, which contains a list of SSA values. This node is in many ways similar in function to the existing ValueAsMetadata node, with the difference being that it tracks a list instead of a single value. Internally, it uses ValueAsMetadata to track the individual values, but there is also a reasonable amount of DIArgList-specific value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special case in parsing and printing due to the fact that it requires a function state (as it may reference function-local values). This patch should not result in any immediate functional change; it allows for DIArgLists to be parsed and printed, but debug variable intrinsics do not yet recognize them as a valid argument (outside of parsing). Differential Revision: https://reviews.llvm.org/D88175	2021-03-05 17:02:24 +00:00
Stephen Tozer	f677413071	Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" Rewrites test to use correct architecture triple; fixes incorrect reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour so as to not be dependent on changes elsewhere in the patch stack. This reverts commit `d2000b45d0`.	2021-03-05 12:32:05 +00:00
Jingu Kang	9b302513f6	[AArch64] Add missing intrinsics for vrnd	2021-03-05 11:26:12 +00:00
Simon Pilgrim	6955524c2f	Fix Wdocumentation unknown parameter warning. NFCI.	2021-03-05 11:24:44 +00:00
Simon Pilgrim	3fd2fa1220	Revert rG8198d83965ba4b9db6922b44ef3041030b2bac39: "[X86] Pass to transform amx intrinsics to scalar operation." This reverts commit 8198d83965ba4b9db6922b44ef3041030b2bac39.due to buildbot breakages	2021-03-05 11:09:14 +00:00
Andy Wingo	a5a3659de7	[WebAssembly][yaml2obj][obj2yaml] Elem sections for nonzero tables With reference types, tables can have non-zero table numbers. This commit adds support for element sections against these tables. Differential Revision: https://reviews.llvm.org/D97923	2021-03-05 11:45:15 +01:00
Petar Avramovic	d44f61f81c	Reland [GlobalISel] Combine zext(trunc x) to x Recommit `4112299ee7`. Depends on `4c8fb7ddd6` which was reverted. Combine zext(trunc x) to x when truncated bits are known to be zero. Differential Revision: https://reviews.llvm.org/D96031	2021-03-05 11:05:37 +01:00
Luo, Yuanke	8198d83965	[X86] Pass to transform amx intrinsics to scalar operation. This pass runs in any situations but we skip it when it is not O0 and the function doesn't have optnone attribute. With -O0, the def of shape to amx intrinsics is near the amx intrinsics code. We are not able to find a point which post-dominate all the shape and dominate all amx intrinsics. To decouple the dependency of the shape, we transform amx intrinsics to scalar operation, so that compiling doesn't fail. In long term, we should improve fast register allocation to allocate amx register. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D93594	2021-03-05 16:02:02 +08:00
Fangrui Song	063b19dea6	[DebugInfo] Delete unused DIVariable::getSource	2021-03-04 21:53:13 -08:00
Fangrui Song	9e28b89827	[DebugInfo] Delete deleted getLine/getColumn r250405 deleted the functions from DILexicalBlockBase.	2021-03-04 21:49:21 -08:00
Michael Kruse	b119120673	[clang][OpenMP] Use OpenMPIRBuilder for workshare loops. Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to: * Only the worksharing-loop directive. * Recognizes only the nowait clause. * No loop nests with more than one loop. * Untested with templates, exceptions. * Semantic checking left to the existing infrastructure. This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics: * The distance function: How many loop iterations there will be before entering the loop nest. * The loop variable function: Conversion from a logical iteration number to the loop variable. These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang. The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does. For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting. Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D94973	2021-03-04 22:52:59 -06:00
Wei Mi	2357d29335	[SampleFDO] Another fix to prevent repeated indirect call promotion in sample loader pass. In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9, to prevent repeated indirect call promotion for the same indirect call and the same target, we used zero-count value profile to indicate an indirect call has been promoted for a certain target. We removed PromotedInsns cache in the same patch. However, there was a problem in that patch described below, and that problem led me to add PromotedInsns back as a mitigation in https://reviews.llvm.org/rG4ffad1fb489f691825d6c7d78e1626de142f26cf. When we get value profile from metadata by calling getValueProfDataFromInst, we need to specify the maximum possible number of values we expect to read. We uses MaxNumPromotions in the last patch so the maximum number of value information extracted from metadata is MaxNumPromotions. If we have many values including zero-count values when we write the metadata, some of them will be dropped when we read them because we only read MaxNumPromotions values. It will allow repeated indirect call promotion again. We need to make sure if there are values indicating promoted targets, those values need to be saved in metadata with higher priority than other values. The patch fixed that problem. We change to use -1 to represent the count of a promoted target instead of 0 so it is easier to sort the values. When we prepare to update the metadata in updateIDTMetaData, we will sort the values in the descending count order and extract only MaxNumPromotions values to write into metadata. Since -1 is the max uint64_t number, if we have equal to or less than MaxNumPromotions of -1 count values, they will all be kept in metadata. If we have more than MaxNumPromotions of -1 count values, we will only save MaxNumPromotions such values maximally. In such case, we have logic in place in doesHistoryAllowICP to guarantee no more promotion in sample loader pass will happen for the indirect call, because it has been promoted enough. With this change, now we can remove PromotedInsns without problem. Differential Revision: https://reviews.llvm.org/D97350	2021-03-04 18:44:12 -08:00
Chen Zheng	87bbf3d1f8	[XCOFF][DebugInfo] support DWARF for XCOFF for assembly output. Reviewed By: jasonliu Differential Revision: https://reviews.llvm.org/D95518	2021-03-04 21:07:52 -05:00
David Blaikie	a2a55def35	Move llvm/Analysis/ObjCARCUtil.h to IR to fix layering. This is included from IR files, and IR doesn't/can't depend on Analysis (because Analysis depends on IR). Also fix the implementation - don't use non-member static in headers, as it leads to ODR violations, inaccurate "unused function" warnings, etc. And fix the header protection macro name (we don't generally include "LIB" in the names, so far as I can tell).	2021-03-04 16:14:53 -08:00
Francis Visoiu Mistrih	365b78396a	[Remarks] Emit variable info in auto-init remarks This enhances the auto-init remark with information about the variable that is auto-initialized. This is based of debug info if available, or alloca names (mostly for development purposes). ``` auto-init.c:4:7: remark: Call to memset inserted by -ftrivial-auto-var-init. Memory operation size: 4096 bytes.Variables: var (4096 bytes). [-Rpass-missed=annotation-remarks] int var[1024]; ^ ``` This allows to see things like partial initialization of a variable that the optimizer won't be able to completely remove. Differential Revision: https://reviews.llvm.org/D97734	2021-03-04 12:51:22 -08:00
Nicolas Guillemot	6b8cf7356c	Revert "[Support] Add raw_ostream_iterator: ostream_iterator for raw_ostream" This reverts commit `7479a2e00b`. This commit causes compile errors on clang-x64-windows-msvc, so I'm reverting the patch for now. For reference, the error in question is: ``` error C2280: 'llvm::raw_ostream_iterator<char,char> &llvm::raw_ostream_iterator<char,char>::operator =(const llvm::raw_ostream_iterator<char,char> &)': attempting to reference a deleted function note: compiler has generated 'llvm::raw_ostream_iterator<char,char>::operator =' here note: 'llvm::raw_ostream_iterator<char,char> &llvm::raw_ostream_iterator<char,char>::operator =(const llvm::raw_ostream_iterator<char,char> &)': function was implicitly deleted because 'llvm::raw_ostream_iterator<char,char>' has a data member 'llvm::raw_ostream_iterator<char,char>::OutStream' of reference type ```	2021-03-04 12:46:58 -08:00
Akira Hatanaka	1900503595	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies `ed4718eccb`, which was reverted because it was causing a miscompile. The bug that was causing the miscompile has been fixed in `75805dce5f`. Original commit message: Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-03-04 11:22:30 -08:00
Nicolas Guillemot	7479a2e00b	[Support] Add raw_ostream_iterator: ostream_iterator for raw_ostream Adds a class `raw_ostream_iterator` that behaves like std::ostream_iterator, but can be used with raw_ostream. This is useful for using raw_ostream with std algorithms. For example, it can be used to output std containers as follows: ``` std::vector<int> V = { 1, 2, 3 }; std::copy(V.begin(), V.end(), raw_ostream_iterator<int>(outs(), ", ")); // Output: "1, 2, 3, " ``` The API tries to follow std::ostream_iterator as closely as is practically possible. Reviewed By: dblaikie, mkitzan Differential Revision: https://reviews.llvm.org/D78795	2021-03-04 11:12:48 -08:00
Adrian Prantl	d268febc56	Improve the debug info for coro-split .resume functions This patch updates the scope line to point to the suspend point. This makes the first address in the function point to the first source line in the resume function rather than the function declaration. Without this the line table "jumps" from the beginning of the function to the suspend point at the beginning. rdar://73386346 Differential Revision: https://reviews.llvm.org/D97345	2021-03-04 11:05:35 -08:00
Albion Fung	36192790d8	[PowerPC][PC Rel] Implement option to omit Power10 instructions from stubs Implemented the option to omit Power10 instructions from save stubs via the option --no-power10-stubs or --power10-stubs=no on lld. --power10-stubs= will override the other option. --power10-stubs=auto also exists to use the default behaviour (ie allow Power10 instructions in stubs). Differential Revision: https://reviews.llvm.org/D94627	2021-03-04 13:27:46 -05:00
Nico Weber	76148caa50	Revert "[llvm-exegesis] Disable the LBR check on AMD" This reverts commit `293e8fa13d`. Breaks build on non-intel hosts, see e.g. http://45.33.8.238/macm1/4600/step_3.txt	2021-03-04 11:48:33 -05:00
Xiangling Liao	e9f9ec837d	[CMake][AIX] Adjust plugin library extension used on AIX As stated in the CMake manual, we are supposed to use MODULE rules to generate plugin libraries: "MODULE libraries are plugins that are not linked into other targets but may be loaded dynamically at runtime using dlopen-like functionality" Besides, LLVM's plugin infrastructure fits with the AIX treatment of .so shared objects more than it fits with the AIX treatment of .a library archives (which may contain shared objects). Differential revision: https://reviews.llvm.org/D96282	2021-03-04 11:23:06 -05:00
Vy Nguyen	293e8fa13d	[llvm-exegesis] Disable the LBR check on AMD https://bugs.llvm.org/show_bug.cgi?id=48918 The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW. Theory we've got is that the lbr-checking code could be confused on AMD. Differential Revision: https://reviews.llvm.org/D97504	2021-03-04 11:16:38 -05:00
Sanjay Patel	36a489d194	[Analysis][LoopVectorize] rename "Unsafe" variables/methods; NFC Similar to `b3a33553ae`, but this shows a TODO and a potential miscompile is already present. We are tracking an FP instruction that does not have FMF (reassoc) properties, so calling that "Unsafe" seems opposite of the common reading. I also removed one getter method by rolling the null check into the access. Further simplification may be possible. The motivation is to clean up the interactions between FMF and function-level attributes in these classes and their callers. The new test shows that there is an existing bug somewhere in the callers. We assumed that the original code was fully 'fast' and so we produced IR with 'fast' even though it was just 'reassoc'.	2021-03-04 10:40:26 -05:00
Nico Weber	59beb1ef6d	Revert "[GlobalISel] Combine zext(trunc x) to x" This reverts commit `4112299ee7`. Seems to depend on `4c8fb7ddd6` which is being reverted.	2021-03-04 10:13:40 -05:00
Petar Avramovic	4112299ee7	[GlobalISel] Combine zext(trunc x) to x Combine zext(trunc x) to x when truncated bits are known to be zero. Differential Revision: https://reviews.llvm.org/D96031	2021-03-04 15:05:23 +01:00
Sanjay Patel	b3a33553ae	[Analysis][LoopVectorize] rename "Unsafe" variables/methods; NFC We are tracking an FP instruction that does not have FMF (reassoc) properties, so calling that "Unsafe" seems opposite of the common reading. I also removed one getter method by rolling the null check into the access. Further simplification seems possible. The motivation is to clean up the interactions between FMF and function-level attributes in these classes and their callers.	2021-03-04 08:53:04 -05:00
Stephen Tozer	d2000b45d0	Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" This reverts commit `d07f106f4a`.	2021-03-04 11:59:21 +00:00
gbtozers	d07f106f4a	[DebugInfo] Add new instruction and DIExpression operator for variadic debug values This patch adds a new instruction that can represent variadic debug values, DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set of basic code changes in MachineInstr and a few adjacent areas, but does not correctly handle variadic debug values outside of these areas, nor does it generate them at any point. The new instruction is similar to the existing DBG_VALUE instruction, with the following differences: the operands are in a different order, any number of values may be used in the instruction following the Variable and Expression operands (these are referred to in code as “debug operands”) and are indexed from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the expression. The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the index given by the argument onto the Expression stack. For example the sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at index 0 onto the expression stack.” Differential Revision: https://reviews.llvm.org/D82363	2021-03-04 11:45:35 +00:00
Andrew Savonichev	d791695cb5	[MCA] Add support for in-order CPUs This patch adds a pipeline to support in-order CPUs such as ARM Cortex-A55. In-order pipeline implements a simplified version of Dispatch, Scheduler and Execute stages as a single stage. Entry and Retire stages are common for both in-order and out-of-order pipelines. Differential Revision: https://reviews.llvm.org/D94928	2021-03-04 14:08:19 +03:00
Fraser Cormack	c907681b07	[NFC] Fix typos in CallingConvLower.h	2021-03-04 10:29:14 +00:00
Hongtao Yu	c75da238b4	[CSSPGO] Deduplicating dangling pseudo probes. Same dangling probes are redundant since they all have the same semantic that is to rely on the counts inference tool to get reasonable count for the same original block. Therefore, there's no need to keep multiple copies of them. I've seen jump threading created tons of redundant dangling probes that slowed down the compiler dramatically. Other optimization passes can also result in redundant probes though without an observed impact so far. This change removes block-wise redundant dangling probes specifically introduced by jump threading. To support removing redundant dangling probes caused by all other passes, a final function-wise deduplication is also added. An 18% size win of the .pseudo_probe section was seen for SPEC2017. No performance difference was observed. Differential Revision: https://reviews.llvm.org/D97482	2021-03-03 22:44:42 -08:00
Hongtao Yu	8985515822	[CSSPGO] Unblocking optimizations by dangling pseudo probes. This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments. The optimizations being unblocked are: 1. Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted. 2. Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions. 3. Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks. Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D97481	2021-03-03 22:44:42 -08:00
Hongtao Yu	ad2a59f584	[CSSPGO] Introducing dangling pseudo probes. Dangling probes are the probes associated to an empty block. This usually happens when all real instructions are optimized away from the block. There is a problem with dangling probes during the offline counts processing. The way the sample profiler works is that samples collected on the first physical instruction following a probe will be counted towards the probe. This logically equals to treating the instruction next to a probe as if it is from the same block of the probe. In the dangling probe case, the real instruction following a dangling probe actually starts a new block, and samples collected on the new block may cause issues when counted towards the empty block. To mitigate this issue, we first try to move around a dangling probe inside its owning block. If there are still native instructions preceding the probe in the same block, we can then use them as a place holder to collect samples for the probe. A pass is added to walk each block backwards looking for probes not followed by any real instruction and moving them before the first real instruction. This is done right before the object emission. If we are unlucky to find such in-block preceding instructions for a probe, the solution we are taking is to tag such probe as dangling so that the samples reported for them will not be trusted by the compiler. We leave it up to the counts inference algorithm to get such probes a reasonable count. The number `UINT64_MAX` is used to mark sample count as collected for a dangling probe. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D95962	2021-03-03 22:44:41 -08:00
Johannes Doerfert	5b70c12f3e	[Attributor] Make DepClass a required argument We often used a sub-optimal dependence class in the past because we didn't see the argument. Let's make it explicit so we remember to think about it.	2021-03-04 00:35:52 -06:00
Johannes Doerfert	e592dad82e	[Attributor] Fold "TrackDependence" into the DepClassTy enum We don't need a bool and an enum to express the three options we currently have. This makes the interface nicer and much easier to use optional dependencies. Also avoids mistakes where the bool is false and enum ignored.	2021-03-04 00:35:52 -06:00
Johannes Doerfert	f3f88287c5	[Attributor] Use known alignment as lower bound to avoid work If we know already more than available from a use, we don't need to invest time on it.	2021-03-04 00:35:52 -06:00
Philip Reames	99f5417346	Sink routine for replacing a operand bundle to CallBase [NFC] We had equivalent code for both CallInst and InvokeInst, but never cared about the result type.	2021-03-03 12:07:55 -08:00
Fangrui Song	a84f4fc0df	[InstrProfiling] Place __llvm_prf_vnodes and __llvm_prf_names in llvm.used on ELF `__llvm_prf_vnodes` and `__llvm_prf_names` are used by runtime but not referenced via relocation in the translation unit. With `-z start-stop-gc` (LLD 13 (D96914); GNU ld 2.37 https://sourceware.org/bugzilla/show_bug.cgi?id=27451), the linker does not let `__start_/__stop_` references retain their sections. Place `__llvm_prf_vnodes` and `__llvm_prf_names` in `llvm.used` to make them retained by the linker. This patch changes most existing `UsedVars` cases to `CompilerUsedVars` to reflect the ideal state - if the binary format properly supports section based GC (dead stripping), `llvm.compiler.used` should be sufficient. `__llvm_prf_vnodes` and `__llvm_prf_names` are switched to `UsedVars` since we want them to be unconditionally retained by both compiler and linker. Behaviors on COFF/Mach-O are not affected. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D97649	2021-03-03 11:32:24 -08:00
Philip Reames	e6e5ef40cb	[basicaa] Fix a latent bug in isGEPBaseAtNegativeOffset This was pointed out in review of D97520 by Nikita, but existed in the original code as well. The basic issue is that a decomposed GEP expression describes (potentially) more than one getelementptr. The "inbounds" derived UB which justifies this aliasing rule requires that the entire offset be composed of "inbounds" geps. Otherwise, as can be seen in the recently added and changes in this patch test, we can end up with a large commulative offset with only a small sub-offset actually being "inbounds". If that small sub-offset lies within the object, the result was unsound. We could potentially be fancier here, but for the moment, simply be conservative when any of the GEPs parsed aren't inbounds.	2021-03-03 08:43:32 -08:00
Philip Reames	dd9922c487	[basicaa] Minor indentation fix	2021-03-03 08:33:19 -08:00
Arnold Schwaighofer	a42bea211a	[coro async] Allow a coro.suspend.async to specify which argument is the context argument Before we used the same argument as the entry point. The resume partial function might want to use a different ABI for its context argument Differential Revision: https://reviews.llvm.org/D97333	2021-03-03 08:27:37 -08:00
Nico Weber	64f5d7e972	Revert "[InstrProfiling] Place __llvm_prf_vnodes and __llvm_prf_names in llvm.used on ELF" This reverts commit `04c3040f41`. Breaks instrprof-value-merge.c in bootstrap builds.	2021-03-03 10:21:17 -05:00
Hans Wennborg	0a5dd06718	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR" This caused miscompiles of Chromium tests for iOS due clobbering of live registers. See discussion on the code review for details. > Background: > > This fixes a longstanding problem where llvm breaks ARC's autorelease > optimization (see the link below) by separating calls from the marker > instructions or retainRV/claimRV calls. The backend changes are in > https://reviews.llvm.org/D92569. > > https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue > > What this patch does to fix the problem: > > - The front-end adds operand bundle "clang.arc.attachedcall" to calls, > which indicates the call is implicitly followed by a marker > instruction and an implicit retainRV/claimRV call that consumes the > call result. In addition, it emits a call to > @llvm.objc.clang.arc.noop.use, which consumes the call result, to > prevent the middle-end passes from changing the return type of the > called function. This is currently done only when the target is arm64 > and the optimization level is higher than -O0. > > - ARC optimizer temporarily emits retainRV/claimRV calls after the calls > with the operand bundle in the IR and removes the inserted calls after > processing the function. > > - ARC contract pass emits retainRV/claimRV calls after the call with the > operand bundle. It doesn't remove the operand bundle on the call since > the backend needs it to emit the marker instruction. The retainRV and > claimRV calls are emitted late in the pipeline to prevent optimization > passes from transforming the IR in a way that makes it harder for the > ARC middle-end passes to figure out the def-use relationship between > the call and the retainRV/claimRV calls (which is the cause of > PR31925). > > - The function inliner removes an autoreleaseRV call in the callee if > nothing in the callee prevents it from being paired up with the > retainRV/claimRV call in the caller. It then inserts a release call if > claimRV is attached to the call since autoreleaseRV+claimRV is > equivalent to a release. If it cannot find an autoreleaseRV call, it > tries to transfer the operand bundle to a function call in the callee. > This is important since the ARC optimizer can remove the autoreleaseRV > returning the callee result, which makes it impossible to pair it up > with the retainRV/claimRV call in the caller. If that fails, it simply > emits a retain call in the IR if retainRV is attached to the call and > does nothing if claimRV is attached to it. > > - SCCP refrains from replacing the return value of a call with a > constant value if the call has the operand bundle. This ensures the > call always has at least one user (the call to > @llvm.objc.clang.arc.noop.use). > > - This patch also fixes a bug in replaceUsesOfNonProtoConstant where > multiple operand bundles of the same kind were being added to a call. > > Future work: > > - Use the operand bundle on x86-64. > > - Fix the auto upgrader to convert call+retainRV/claimRV pairs into > calls with the operand bundles. > > rdar://71443534 > > Differential Revision: https://reviews.llvm.org/D92808 This reverts commit `ed4718eccb`.	2021-03-03 15:51:40 +01:00
Hans Wennborg	ddf43e5130	revert llvm/include/llvm/Analysis/ObjCARCUtil.h part of `1cc558bd4f`	2021-03-03 15:50:02 +01:00
Matt Arsenault	78dcff4841	GlobalISel: Add default implementation of assignValueToReg Refactor insertion of the asserting ops. This enables using them for AMDGPU. This code should essentially be the same for every target. Mips, X86 and ARM all have different code there now, but this seems to be an accident. The assignment functions are called with different types than they would be in the DAG, so this is all likely an assortment of hacks to get around that.	2021-03-03 09:29:53 -05:00
Piotr Sobczak	4672bac177	[AMDGPU] Introduce Strict WQM mode * Add amdgcn_strict_wqm intrinsic. * Add a corresponding STRICT_WQM machine instruction. * The semantic is similar to amdgcn_strict_wwm with a notable difference that not all threads will be forcibly enabled during the computations of the intrinsic's argument, but only all threads in quads that have at least one thread active. * The difference between amdgc_wqm and amdgcn_strict_wqm, is that in the strict mode an inactive lane will always be enabled irrespective of control flow decisions. Reviewed By: critson Differential Revision: https://reviews.llvm.org/D96258	2021-03-03 14:19:16 +01:00
Piotr Sobczak	c3ce7bae80	[AMDGPU] Rename amdgcn_wwm to amdgcn_strict_wwm * Introduce the new intrinsic amdgcn_strict_wwm * Deprecate the old intrinsic amdgcn_wwm The change is done for consistency as the "strict" prefix will become an important, distinguishing factor between amdgcn_wqm and amdgcn_strictwqm in the future. The "strict" prefix indicates that inactive lanes do not take part in control flow, specifically an inactive lane enabled by a strict mode will always be enabled irrespective of control flow decisions. The amdgcn_wwm will be removed, but doing so in two steps gives users time to switch to the new name at their own pace. Reviewed By: critson Differential Revision: https://reviews.llvm.org/D96257	2021-03-03 09:33:57 +01:00
Carl Ritson	2ddac69f98	[AMDGPU] Rename llvm.amdgcn.msaa.load to llvm.amdgcn.msaa.load.x While the underlying instruction is called image_msaa_load, the resource must be x component only. Rename the intrinsic for clarity. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D97829	2021-03-03 17:30:39 +09:00
Victor Huang	1756b2adc9	[AIX][TLS] Generate TLS variables in assembly files This patch allows generating TLS variables in assembly files on AIX. Initialized and external uninitialized variables are generated with the .csect pseudo-op and local uninitialized variables are generated with the .comm/.lcomm pseudo-ops. The patch also adds a check to explicitly say that TLS is not yet supported on AIX. Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile Originally patched by: bsaleil Commandeered by: NeHuang Differential Revision: https://reviews.llvm.org/D96184	2021-03-02 18:22:48 -06:00
Arthur Eubanks	99f1e86cbb	[opt] Error if -debug-pass is specified alongside the new PM Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D97810	2021-03-02 15:59:28 -08:00
Nikita Popov	29034f3876	[AST] Remove unused Loop member (NFC) To fix some build bots after D89264.	2021-03-02 23:11:51 +01:00
Nikita Popov	3d8f842712	[LICM] Make promotion faster Even when MemorySSA-based LICM is used, an AST is still populated for scalar promotion. As the AST has quadratic complexity, a lot of time is spent in this step despite the existing access count limit. This patch optimizes the identification of promotable stores. The idea here is pretty simple: We're only interested in must-alias mod sets of loop invariant pointers. As such, only populate the AST with loop-invariant loads and stores (anything else is definitely not promotable) and then discard any sets which alias with any of the remaining, definitely non-promotable accesses. If we promoted something, check whether this has made some other accesses loop invariant and thus possible promotion candidates. This is much faster in practice, because we need to perform AA queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable) instead of O(NumTotal^2), and NumPromotable tends to be small. Additionally, promotable accesses have loop invariant pointers, for which AA is cheaper. This has a signicant positive compile-time impact. We save ~1.8% geomean on CTMark at O3, with 6% on lencod in particular and 25% on individual files. Conceptually, this change is NFC, but may not be so in practice, because the AST is only an approximation, and can produce different results depending on the order in which accesses are added. However, there is at least no impact on the number of promotions (licm.NumPromoted) in test-suite O3 configuration with this change. Differential Revision: https://reviews.llvm.org/D89264	2021-03-02 22:10:48 +01:00
Amara Emerson	8a316045ed	[AArch64][GlobalISel] Enable use of the optsize predicate in the selector. To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis dependencies to the instruction selector pass. Then, use the predicate to generate constant pool loads for f32 materialization, if we're targeting optsize/minsize. Differential Revision: https://reviews.llvm.org/D97732	2021-03-02 12:55:51 -08:00
Sanjay Patel	415c67ba4c	[SDAG] allow partial undef vector constants with select->logic folds This is an enhancement suggested in the original review/commit: D97730 / `7fce3322a2`	2021-03-02 14:29:15 -05:00
Vy Nguyen	9a2e2de15f	[lld-macho] Change loadReexport to handle the case where a TAPI re-exports to reference documents nested within other TBD. Currently, it was delibrately impleneted to not handle this case, but as it has turnt out, we need this feature. The concrete use case is `System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa` reexports /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit , which then rexports /System/Library/PrivateFrameworks/UIFoundation.framework/Versions/A/UIFoundation The current implemention uses a global currentTopLevelTapi, which is not reset until it finishes loading the whole tree. This is a problem because if the top-level is set to Cocoa, then when we get to UIFoundation, it will try to find UIFoundation in the current top level, which is Cocoa and will not find it. The right thing should be: - When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD. - When loading from an actual dylib, no additional TBD documents need to be examined. - In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file Differential Revision: https://reviews.llvm.org/D97438	2021-03-02 12:14:31 -05:00
Krzysztof Parzyszek	d96b5e606a	[TableGen] Add IntrNoMerge as intrinsic property There is a function attribute 'nomerge' in addition to 'noduplicate' and 'convergent'. Both 'noduplicate' and 'convergent' have corresponding intrinsic properties. This patch adds an intrinsic property for the 'nomerge' attribute. Differential Revision: https://reviews.llvm.org/D96364	2021-03-02 09:04:50 -08:00
dfukalov	6e967834b9	[AA] Cache (optionally) estimated PartialAlias offsets. For the cases of two clobbering loads and one loaded object is fully contained in the second `BasicAAResult::aliasGEP` returns just `PartialAlias` that is actually more common case of partial overlap, it doesn't say anything about actual overlapping sizes. AA users such as GVN and DSE have no functionality to estimate aliasing of GEPs with non-constant offsets. The change stores estimated relative offsets so they can be used further. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93529	2021-03-02 19:04:15 +03:00
Tim Northover	888c5c24ca	AArch64: report fp16 arithmetic is present for apple-a11 CPU. AArch64.td got it right, but the target-parser dropped it, leading to missing feature flags in Clang.	2021-03-02 15:07:18 +00:00
Stefan Gränitz	ef2389235c	[Orc] Add JITLink debug support plugin for ELF x86-64 Add a new ObjectLinkingLayer plugin `DebugObjectManagerPlugin` and infrastructure to handle creation of `DebugObject`s as well as their registration in OrcTargetProcess. The current implementation only covers ELF on x86-64, but the infrastructure is not limited to that. The journey starts with a new `LinkGraph` / `JITLinkContext` pair being created for a `MaterializationResponsibility` in ORC's `ObjectLinkingLayer`. It sends a `notifyMaterializing()` notification, which is forwarded to all registered plugins. The `DebugObjectManagerPlugin` aims to create a `DebugObject` form the provided target triple and object buffer. (Future implementations might create `DebugObject`s from a `LinkGraph` in other ways.) On success it will track it as the pending `DebugObject` for the `MaterializationResponsibility`. This patch only implements the `ELFDebugObject` for `x86-64` targets. It follows the RuntimeDyld approach for debug object setup: it captures a copy of the input object, parses all section headers and prepares to patch their load-address fields with their final addresses in target memory. It instructs the plugin to report the section load-addresses once they are available. The plugin overrides `modifyPassConfig()` and installs a JITLink post-allocation pass to capture them. Once JITLink emitted the finalized executable, the plugin emits and registers the `DebugObject`. For emission it requests a new `JITLinkMemoryManager::Allocation` with a single read-only segment, copies the object with patched section load-addresses over to working memory and triggers finalization to target memory. For registration, it notifies the `DebugObjectRegistrar` provided in the constructor and stores the previously pending`DebugObject` as registered for the corresponding MaterializationResponsibility. The `DebugObjectRegistrar` registers the `DebugObject` with the target process. `llvm-jitlink` uses the `TPCDebugObjectRegistrar`, which calls `llvm_orc_registerJITLoaderGDBWrapper()` in the target process via `TargetProcessControl` to emit a `jit_code_entry` compatible with the GDB JIT interface [1]. So far the implementation only supports registration and no removal. It appears to me that it wouldn't raise any new design questions, so I left this as an addition for the near future. [1] https://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D97335	2021-03-02 15:07:35 +01:00
Stefan Gränitz	48c2acff0c	[JITLink] LinkGraph::getName() can be const	2021-03-02 15:07:34 +01:00
Jan Svoboda	4545813b17	[clang][cli] NFC: Rename marshalling multiclass The new name drops `String` from `MarshallingInfoStringInt`, which follows the naming convention of other marshalling multiclasses.	2021-03-02 11:53:40 +01:00
Yuanfang Chen	5de2d189e6	[Diagnose] Unify MCContext and LLVMContext diagnosing The situation with inline asm/MC error reporting is kind of messy at the moment. The errors from MC layout are not reliably propagated and users have to specify an inlineasm handler separately to get inlineasm diagnose. The latter issue is not a correctness issue but could be improved. * Kill LLVMContext inlineasm diagnose handler and migrate it to use DiagnoseInfo/DiagnoseHandler. * Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This covers use cases like inlineasm, MC, and any clients using SourceMgr. * Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all use cases, only one of them is used. * If LLVMContext is available, let MCContext uses LLVMContext's diagnose handler; if LLVMContext is not available, MCContext uses its own default diagnose handler which just prints SMDiagnostic. * Change a few clients(Clang, llc, lldb) to use the new way of reporting. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D97449	2021-03-01 15:58:37 -08:00
Matt Arsenault	0131498402	GlobalISel: Remove dead code Generic code should probably not introduce G_INSERT/G_EXTRACT. The mirror unpackRegs should also be removed, but AMDGPU still has a use remaining which needs to be fixed.	2021-03-01 17:06:43 -05:00
Fangrui Song	04c3040f41	[InstrProfiling] Place __llvm_prf_vnodes and __llvm_prf_names in llvm.used on ELF `__llvm_prf_vnodes` and `__llvm_prf_names` are used by runtime but not referenced via relocation in the translation unit. With `-z start-stop-gc` (D96914 https://sourceware.org/bugzilla/show_bug.cgi?id=27451), the linker no longer lets `__start_/__stop_` references retain them. Place `__llvm_prf_vnodes` and `__llvm_prf_names` in `llvm.used` to make them retained by the linker. This patch changes most existing `UsedVars` cases to `CompilerUsedVars` to reflect the ideal state - if the binary format properly supports section based GC (dead stripping), `llvm.compiler.used` should be sufficient. `__llvm_prf_vnodes` and `__llvm_prf_names` are switched to `UsedVars` since we want them to be unconditionally retained by both compiler and linker. Behaviors on other COFF/Mach-O are not affected. Differential Revision: https://reviews.llvm.org/D97649	2021-03-01 13:43:23 -08:00
Nicolas Guillemot	6fb6bdff37	Fix the value_type of defusechain_iterator to match its operator() defusechain_iterator has an operator() and operator->() that return references to a MachineOperand, but its "reference" and "pointer" typedefs are set as if the iterator returns a MachineInstr reference. This causes compilation errors when defusechain_iterator is used in generic code that uses the "reference" and "pointer" typedefs. This patch fixes this by updating the typedefs to use MachineOperand instead of MachineInstr. Reviewed By: mkitzan Differential Revision: https://reviews.llvm.org/D97522	2021-03-01 10:41:10 -08:00
Arthur Eubanks	040c1b49d7	Move EntryExitInstrumentation pass location This seems to be more of a Clang thing rather than a generic LLVM thing, so this moves it out of LLVM pipelines and as Clang extension hooks into LLVM pipelines. Move the post-inline EEInstrumentation out of the backend pipeline and into a late pass, similar to other sanitizer passes. It doesn't fit into the codegen pipeline. Also fix up EntryExitInstrumentation not running at -O0 under the new PM. PR49143 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D97608	2021-03-01 10:08:10 -08:00
Jay Foad	216dee9170	[AMDGPU] Add IntrWillReturn to recently added intrinsics This adds IntrWillReturn to the gfx90a mfma intrinsics, to match all the other mfma intrinsics, and llvm.amdgcn.live.mask, to match llvm.amdgcn.ps.live. Differential Revision: https://reviews.llvm.org/D97675	2021-03-01 17:35:26 +00:00
Juneyoung Lee	c89d9d8a48	[TTI] Consider select form of and/or i1 as having arithmetic cost This is a patch that updates the cost of `select i1 a, b, false` to be equivalent to that of `and i1 a, b` as well as the cost of `select i1 a, true, b` equivalent to `or i1 a, b`. Until now, these selects were folded into and/or i1 by InstCombine, but the transformation is poison-unsafe. This is a step towards removing the unsafe transformation. D93065 has relevant transformations linked. These selects should be translated into the assemblies as and/or i1 do in the same manner. The cost should be equivalent. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D97360	2021-03-02 02:18:19 +09:00
Andy Wingo	2632ba6a35	[WebAssembly] call_indirect issues table number relocs If the reference-types feature is enabled, call_indirect will explicitly reference its corresponding function table via TABLE_NUMBER relocations against a table symbol. Also, as before, address-taken functions can also cause the function table to be created, only with reference-types they additionally cause a symbol table entry to be emitted. Differential Revision: https://reviews.llvm.org/D90948	2021-03-01 16:49:00 +01:00
Masoud Ataei	5fe0cab79e	[PowerPC] Removing sqrtd2 and sqrtf4 from list of vectorizable function with MASSV Under -O3 and -Ofast, the MASSV conversion prevents the sqrt call to be inlined. Inline sqrt is faster than MASSV call on leppc. Differential Revision: https://reviews.llvm.org/D97487	2021-03-01 15:42:19 +00:00
Jay Foad	796a60d2ea	[AMDGPU] New intrinsic void llvm.amdgcn.s.sethalt(i32) The expected use case is for frontends to insert this into shaders that are to be run under a debugger. The shader can then be resumed or single stepped from the point of the call under debugger control. Differential Revision: https://reviews.llvm.org/D97670	2021-03-01 14:30:23 +00:00
Matt Arsenault	6c260d3bc0	GlobalISel: Move splitToValueTypes to generic code I copied the nearly identical function from AArch64 into AMDGPU, so fix this duplication. Mips and X86 have their own more exotic versions which should be removed. However replacing those is better left for a separate patch since it requires other changes to avoid regressions.	2021-03-01 08:58:18 -05:00
Florian Hahn	53dacb7b67	[LV] Generate RT checks up-front and remove them if required. This patch updates LV to generate the runtime checks just after cost modeling, to allow a more precise estimate of the actual cost of the checks. This information will be used in future patches to generate larger runtime checks in cases where the checks only make up a small fraction of the expected scalar loop execution time. The runtime checks are created up-front in a temporary block to allow better estimating the cost and un-linked from the existing IR. After deciding to vectorize, the checks are moved backed. If deciding not to vectorize, the temporary block is completely removed. This patch is similar in spirit to D71053, but explores a different direction: instead of delaying the decision on whether to vectorize in the presence of runtime checks it instead optimistically creates the runtime checks early and discards them later if decided to not vectorize. This has the advantage that the cost-modeling decisions can be kept together and can be done up-front and thus preserving the general code structure. I think delaying (part) of the decision to vectorize would also make the VPlan migration a bit harder. One potential drawback of this patch is that we speculatively generate IR which we might have to clean up later. However it seems like the code required to do so is quite manageable. Reviewed By: lebedev.ri, ebrevnov Differential Revision: https://reviews.llvm.org/D75980	2021-03-01 10:48:04 +00:00
Fraser Cormack	6718fda6ad	[CodeGen] Fix issues with subvector intrinsic index types This patch addresses issues arising from the fact that the index type used for subvector insertion/extraction is inconsistent between the intrinsics and SDNodes. The intrinsic forms require i64 whereas the SDNodes use the type returned by SelectionDAG::getVectorIdxTy. Rather than update the intrinsic definitions to use an overloaded index type, this patch fixes the issue by transforming the index to the correct type as required. Any loss of index bits going from i64 to a smaller type is unexpected, and will be caught by an assertion in SelectionDAG::getVectorIdxConstant. The patch also updates the documentation for INSERT_SUBVECTOR and adds an assertion to its creation to bring it in line with EXTRACT_SUBVECTOR. This necessitated changes to AArch64 which was using i64 for EXTRACT_SUBVECTOR but i32 for INSERT_SUBVECTOR. Only one test changed its codegen after updating the backend accordingly. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D97459	2021-03-01 10:28:21 +00:00
Juneyoung Lee	5419b67137	[SimplifyCFG] Update FoldTwoEntryPHINode to handle and/or of select and binop equally This is a minor change that fixes FoldTwoEntryPHINode to handle phis with and/ors of select form and binop form equally.	2021-03-01 13:34:51 +09:00
Kazu Hirata	b4bed1cb24	[IR] Use range-based for loops (NFC)	2021-02-28 10:59:23 -08:00
Kazu Hirata	d639120983	[llvm] Use set_is_subset (NFC)	2021-02-28 10:59:20 -08:00
William S. Moses	b077d82b00	[Attributor] Conditinoally delete fns Allow the attributor to delete functions only if requested Differential Revision: https://reviews.llvm.org/D97238	2021-02-27 20:37:42 -05:00
Heejin Ahn	aa097ef8d4	[WebAssembly] Fix reverse mapping in WasmEHFuncInfo D97247 added the reverse mapping from unwind destination to their source, but it had a critical bug; sources can be multiple, because multiple BBs can have a single BB as their unwind destination. This changes `WasmEHFuncInfo::getUnwindSrc` to `getUnwindSrcs` and makes it return a vector rather than a single BB. It does not return the const reference to the existing vector but creates a new vector because `WasmEHFuncInfo` stores not `BasicBlock` or `MachineBasicBlock` but `PointerUnion` of them. Also I hoped to unify those methods for `BasicBlock` and `MachineBasicBlock` into one using templates to reduce duplication, but failed because various usages require `BasicBlock*` to be `const` but it's hard to make it `const` for `MachineBasicBlock` usages. Fixes https://github.com/emscripten-core/emscripten/issues/13514. (More precisely, fixes https://github.com/emscripten-core/emscripten/issues/13514#issuecomment-784708744) Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D97583	2021-02-26 17:12:10 -08:00
Fangrui Song	47c5576d7d	ELF: Create unique SHF_GNU_RETAIN sections for llvm.used global objects If a global object is listed in `@llvm.used`, place it in a unique section with the `SHF_GNU_RETAIN` flag. The section is a GC root under `ld --gc-sections` with LLD>=13 or GNU ld>=2.36. For front ends which do not expect to see multiple sections of the same name, consider emitting `@llvm.compiler.used` instead of `@llvm.used`. SHF_GNU_RETAIN is restricted to ELFOSABI_GNU and ELFOSABI_FREEBSD in binutils. We don't do the restriction - see the rationale in D95749. The integrated assembler has supported SHF_GNU_RETAIN since D95730. GNU as>=2.36 supports section flag 'R'. We don't need to worry about GNU ld support because older GNU ld just ignores the unknown SHF_GNU_RETAIN. With this change, `__attribute__((retain))` functions/variables emitted by clang will get the SHF_GNU_RETAIN flag. Differential Revision: https://reviews.llvm.org/D97448	2021-02-26 16:38:44 -08:00
Philip Reames	f2cfef3596	Be more mathematicly precise about definition of recurrence [NFC] This clarifies the interface of the matchSimpleRecurrence helper introduced in `8020be0b8` for non-commutative operators. After `ebd3aeba`, I realized the original way I framed the routine was inconsistent. For shifts, we only matched the the LHS form, but for sub we matched both and the caller wanted that information. So, instead, we now consistently match both forms for non-commutative operators and the caller becomes responsible for filtering if needed. I tried to put a clear warning in the header because I suspect the RHS form of e.g. a sub recurrence is non-obvious for most folks. (It was for me.)	2021-02-26 11:22:01 -08:00
Philip Reames	ebd3aeba27	Use helper introduced in `8020be0b8` to simplify ValueTracking [NFC] Direct rewrite of the code the helper was extracted from.	2021-02-26 10:47:26 -08:00
Philip Reames	8020be0b8b	Add a helper for matching simple recurrence cycles This helper came up in another review, and I've got about 4 different patches with copies of this copied into it. Time to precommit the routine. :)	2021-02-26 10:21:23 -08:00
Mircea Trofin	a2bfc43ae1	[NFC] Const-ed 2 APIs in VirtRegMap	2021-02-26 09:32:42 -08:00
Mircea Trofin	a00f7dc2d5	[NFC] MCRegister fixes in RegisterClassInfo, and const-ed APIs	2021-02-26 08:53:57 -08:00
Vladislav Vinogradov	9909237d99	[ADT][NFC] Add extra typedefs to `ArrayRef` and `MutableArrayRef` * `value_type` * `pointer` * `const_pointer` * `reference` * `const_reference` * `const_reverse_iterator` * `size_type` * `difference_type` It makes `ArrayRef` and `MutableArrayRef` types fully compliant with STL Container concept. Reviewed By: lattner, courbet Differential Revision: https://reviews.llvm.org/D95611	2021-02-26 18:37:08 +03:00
Evgeniy Brevnov	13a5cac2ba	Revert "[NARY-REASSOCIATE] Support reassociation of min/max" This reverts commit `83d134c3c4`.	2021-02-26 19:47:54 +07:00
Stefan Gränitz	406ef36b03	[Orc] Use extensible RTTI for the orc::ObjectLayer class hierarchy So far we had no way to distinguish between JITLink and RuntimeDyld in lli. Instead, we used implicit knowledge that RuntimeDyld would be used for linking ELF. In order to get D97337 to work with lli though, we have to move on and allow JITLink for ELF. This patch uses extensible RTTI to allow external clients to add their own layers without touching the LLVM sources. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D97338	2021-02-26 13:13:05 +01:00
Chen Zheng	d39bc36b1b	[debug-info] refactor emitDwarfUnitLength remove `Hi` `Lo` argument from `emitDwarfUnitLength`, so we can make caller of emitDwarfUnitLength easier. Reviewed By: MaskRay, dblaikie, ikudrin Differential Revision: https://reviews.llvm.org/D96409	2021-02-25 21:00:25 -05:00
James Y Knight	24539f1ef2	Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. And then push those change throughout LLVM. Keep the old signature in Clang's CGBuilder for now -- that will be updated in a follow-on patch (D97224). The MLIR LLVM-IR dialect is not updated to support the new alignment attribute, but preserves its existing behavior. Differential Revision: https://reviews.llvm.org/D97223	2021-02-25 18:29:42 -05:00
Francis Visoiu Mistrih	fee9abe69c	[Remarks] Provide more information about auto-init calls This now analyzes calls to both intrinsics and functions. For intrinsics, grab the ones we know and care about (mem* family) and analyze the arguments. For calls, use TLI to get more information about the libcalls, then analyze the arguments if known. ``` auto-init.c:4:7: remark: Call to memset inserted by -ftrivial-auto-var-init. Memory operation size: 4096 bytes. [-Rpass-missed=annotation-remarks] int var[1024]; ^ ``` Differential Revision: https://reviews.llvm.org/D97489	2021-02-25 15:14:09 -08:00
Francis Visoiu Mistrih	4753a69a31	[Remarks] Provide more information about auto-init stores This adds support for analyzing the instruction with the !annotation "auto-init" in order to generate a more user-friendly remark. For now, support the store size, and whether it's atomic/volatile. Example: ``` auto-init.c:4:7: remark: Store inserted by -ftrivial-auto-var-init.Store size: 4 bytes. [-Rpass-missed=annotation-remarks] int var; ^ ``` Differential Revision: https://reviews.llvm.org/D97412	2021-02-25 15:14:09 -08:00
Adrian Prantl	00b3f2f310	Add more historic DWARF vendor extensions The maintainer of libdwarf kindly provided this patch with a bunch of historic DWARF extensions that are missing from Dwarf.def. This list is helpful to avoid potential conflicts in the user-defined vendor extension space in the future. Patch by David Anderson! [Relanded with an updated test.] Differential Revision: https://reviews.llvm.org/D97242	2021-02-25 15:09:42 -08:00
Richard Smith	95d0d8e9e9	Fix constructor declarations that are invalid in C++20 onwards. Under C++ CWG DR 2237, the constructor for a class template C must be written as 'C(...)' not as 'C<T>(...)'. This fixes a build failure with GCC in C++20 mode. In passing, remove some other redundant '<T>' qualification from the affected classes.	2021-02-25 14:25:01 -08:00
Fangrui Song	4d63892acb	[SanitizerCoverage] Drop !associated on metadata sections In SanitizerCoverage, the metadata sections (`__sancov_guards`, `__sancov_cntrs`, `__sancov_bools`) are referenced by functions. After inlining, such a `__sancov_*` section can be referenced by more than one functions, but its sh_link still refers to the original function's section. (Note: a SHF_LINK_ORDER section referenced by a section other than its linked-to section violates the invariant.) If the original function's section is discarded (e.g. LTO internalization + `ld.lld --gc-sections`), ld.lld may report a `sh_link points to discarded section` error. This above reasoning means that `!associated` is not appropriate to be called by an inlinable function. Non-interposable functions are inline candidates, so we have to drop `!associated`. A `__sancov_pcs` is not referenced by other sections but is expected to parallel a metadata section, so we have to make sure the two sections are retained or discarded at the same time. A section group does the trick. (Note: we have a module ctor, so `getUniqueModuleId` guarantees to return a non-empty string, and `GetOrCreateFunctionComdat` guarantees to return non-null.) For interposable functions, we could keep using `!associated`, but LTO can change the linkage to `internal` and allow such functions to be inlinable, so we have to drop `!associated`, too. To not interfere with section group resolution, we need to use the `noduplicates` variant (section group flag 0). (This allows us to get rid of the ModuleID parameter.) In -fno-pie and -fpie code (mostly dso_local), instrumented interposable functions have WeakAny/LinkOnceAny linkages, which are rare. So the section group header overload should be low. This patch does not change the object file output for COFF (where `!associated` is ignored). Reviewed By: morehouse, rnk, vitalybuka Differential Revision: https://reviews.llvm.org/D97430	2021-02-25 11:59:23 -08:00
Stanislav Mekhanoshin	d9c99043bd	Option to ignore llvm[.compiler].used uses in hasAddressTaken() Differential Revision: https://reviews.llvm.org/D96087	2021-02-25 10:06:24 -08:00
Stanislav Mekhanoshin	29e2d9461a	Option to ignore assume like intrinsic uses in hasAddressTaken() Differential Revision: https://reviews.llvm.org/D96081	2021-02-25 09:48:29 -08:00
Jon Roelofs	7f6e331645	Support `#pragma clang section` directives on MachO targets rdar://59560986 Differential Revision: https://reviews.llvm.org/D97233	2021-02-25 09:30:10 -08:00
Rong Xu	6103b6ad69	[SampleFDO][NFC] Refactor: make SampleProfileLoaderBaseImpl a template class This patch makes SampleProfileLoaderBaseImpl a template class so it can be used in CodeGen transformation. Noticeable changes: * use one template parameter and use IRTraits to get other used types an type specific functions. * remove the temporary "inline" keywords in previous refactor patch. * change the template function findEquivalencesFor to a regular function. This function has a single caller with type of PostDominatorTree. It's simpler to use the type directly because MachinePostDominatorTree is not a derived type of template DominatorTreeBase. Differential Revision: https://reviews.llvm.org/D96981	2021-02-25 08:26:17 -08:00
Fraser Cormack	b368fc735d	[CodeGen] Format code comment to 80 columns. NFC.	2021-02-25 15:55:21 +00:00
Jan Svoboda	43cac1d27d	[clang][cli] NFC: Remove ArgList infrastructure for recording queries This patch removes the infrastructure for recording queries in `ArgList`, partially reverting D94472. The infrastructure was used during command line round-trip to determine which arguments should a certain subset of `CompilerInvocation` generate. Since D96280, the command line arguments are being generated all at once, making this code no longer necessary. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D96325	2021-02-25 13:53:24 +01:00
Evgeniy Brevnov	83d134c3c4	[NARY-REASSOCIATE] Support reassociation of min/max Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88287	2021-02-25 18:22:39 +07:00
Jan Svoboda	88e45f00c1	[clang][cli] Add MarshallingInfoEnum multiclass This patch introduces a tablegen multiclass called `MarshallingInfoEnum`. It has the same semantics as `MarshallingInfoString` had in combination with `AutoNormalizeEnum`, but it's easier to use and follows the convention used for other `MarshallingInfoXxx` multiclasses. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D97375	2021-02-25 08:47:18 +01:00
Liu, Chen3	4bc7c8631a	[X86] Support amx-bf16 intrinsic. Adding support for intrinsics of AMX-BF16. This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong predicate. Differential Revision: https://reviews.llvm.org/D97358	2021-02-25 09:06:48 +08:00
Greg McGary	151990dd94	[lld-macho] add code signature for native arm64 macOS Differential Revision: https://reviews.llvm.org/D96164	2021-02-24 17:05:23 -08:00
Duncan P. N. Exon Smith	01701646d5	Transforms: Clone distinct nodes in metadata mapper unless RF_ReuseAndMutateDistinctMDs This is a follow up to `22a52dfddc` and a revert of `df763188c9`. With this change, we only skip cloning distinct nodes in MDNodeMapper::mapDistinct if RF_ReuseAndMutateDistinctMDs, dropping the no-longer-needed local helper `cloneOrBuildODR()`. Skipping cloning in other cases is unsound and breaks CloneModule, which is why the textual IR for PR48841 didn't pass previously. This commit adds the test as: Transforms/ThinLTOBitcodeWriter/cfi-debug-info-cloned-type-references-global-value.ll Cloning less often exposed a hole in subprogram cloning in CloneFunctionInto thanks to df763188c9a1ecb1e7e5c4d4ea53a99fbb755903's test ThinLTO/X86/Inputs/dicompositetype-unique-alias.ll. If a function has a subprogram attachment whose scope is a DICompositeType that shouldn't be cloned, but it has no internal debug info pointing at that type, that composite type was being cloned. This commit plugs that hole, calling DebugInfoFinder::processSubprogram from CloneFunctionInto. As hinted at in 22a52dfddcefad4f275eb8ad1cc0e200074c2d8a's commit message, I think we need to formalize ownership of metadata a bit more so that ValueMapper/CloneFunctionInto (and similar functions) can deal with cloning (or not) metadata in a more generic, less fragile way. This fixes PR48841. Differential Revision: https://reviews.llvm.org/D96734	2021-02-24 12:57:52 -08:00
Duncan P. N. Exon Smith	1e1b92f76d	IR: Rename Metadata::ImplicitCode to SubclassData1, NFC Metadata::ImplicitCode is a bit shaved off of Metadata::Storage, currently only in use by the subclass DILocation. However, the bit isn't reserved for that purpose. Rename it `SubclassData1` to make it clear that it has nothing to do with Metadata itself (and other subclasses are free to use it). As a drive-by, remove an old TODO about exposing bits to subclasses (looks like that has mostly been done). No functionality change here. Differential Revision: https://reviews.llvm.org/D96740	2021-02-24 12:56:26 -08:00
James Y Knight	c2487bf7df	Remove a workaround for MSVC 2013, now that MSVC 2017 is the minimum. In MSVC 2013, 'alignas(integer-template-arg)' didn't compile; verified on godbolt that this now works properly.	2021-02-24 13:56:49 -05:00
Lang Hames	8380d07e39	[JITLink] Add assertions, fix a comment. The new assertions check that Addressables removed when removing external or absolute symbols are not referenced by another symbol. A comment on post-fixup passes is updated: vmaddrs have all been set up by the time the pre-fixup passes are run, post-fixup passes run after fixups have been applied to content.	2021-02-24 21:02:37 +11:00
Dan Liew	7d3ef103b5	[ASan] Introduce a way set different ways of emitting module destructors. Previously there was no way to control how module destructors were emitted by `ModuleAddressSanitizerPass`. However, we want language frontends (e.g. Clang) to be able to decide how to emit these destructors (if at all). This patch introduces the `AsanDtorKind` enum that represents the different ways destructors can be emitted. There are currently only two valid ways to emit destructors. * `Global` - Use `llvm.global_dtors`. This was the previous behavior and is the default. * `None` - Do not emit module destructors. The `ModuleAddressSanitizerPass` and the various wrappers around it have been updated to take the `AsanDtorKind` as an argument. The `-asan-destructor-kind=` command line argument has been introduced to make this easy to test from `opt`. If this argument is specified it overrides the value passed to the `ModuleAddressSanitizerPass` constructor. Note that `AsanDtorKind` is not `bool` because we will introduce a new way to emit destructors in a subsequent patch. Note that `AsanDtorKind` is given its own header file because if it is declared in `Transforms/Instrumentation/AddressSanitizer.h` it leads to compile error (Module is ambiguous) when trying to use it in `clang/Basic/CodeGenOptions.def`. rdar://71609176 Differential Revision: https://reviews.llvm.org/D96571	2021-02-23 20:01:21 -08:00
Nico Weber	f14a14dd25	Revert "Add more historic DWARF vendor extensions" This reverts commit `c4a9144468`. Breaks check-llvm everywhere, see https://reviews.llvm.org/D97242#2583716	2021-02-23 22:10:02 -05:00
Chen Zheng	be5d92e37e	[Debug-Info][NFC] move emitDwarfUnitLength to MCStreamer class We may need to do some customization for DWARF unit length in DWARF section headers for some targets for some code generation path. For example, for XCOFF in assembly path, AIX assembler does not require the debug section containing its debug unit length in the header. Move emitDwarfUnitLength to MCStreamer class so that we can do customization in different Streamers Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D95932	2021-02-23 21:29:05 -05:00
Adrian Prantl	c4a9144468	Add more historic DWARF vendor extensions The maintainer of libdwarf kindly provided this patch with a bunch of historic DWARF extensions that are missing from Dwarf.def. This list is helpful to avoid potential conflicts in the user-defined vendor extension space in the future. Patch by David Anderson! Differential Revision: https://reviews.llvm.org/D97242	2021-02-23 17:54:04 -08:00
Juneyoung Lee	56d228a14e	[SimplifyCFG] Update passingValueIsAlwaysUndefined to check more attributes This is a simple patch to update SimplifyCFG's passingValueIsAlwaysUndefined to inspect more attributes. A new function `CallBase::isPassingUndefUB` checks attributes that imply noundef. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D97244	2021-02-24 10:40:50 +09:00
Erich Keane	af4451eb4f	[NFC] Make TrailingObjects non-copyable/non-movable This got me pretty recently... TrailingObjects cannot be copied or moved, since they need to be pre-allocated. This patch deletes the copy and move operations (plus re-adds the default ctor). Differential Revision: https://reviews.llvm.org/D97324	2021-02-23 16:30:13 -08:00
Fangrui Song	ef312951fd	collectUsedGlobalVariables: migrate SmallPtrSetImpl overload to SmallVecImpl overload after D97128 And delete the SmallPtrSetImpl overload. While here, decrease inline element counts from 8 to 4. See D97128 for the choice. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D97257	2021-02-23 16:09:06 -08:00
Fangrui Song	3adb89bb9f	[ThinLTO] Make cloneUsedGlobalVariables deterministic Iterating on `SmallPtrSet<GlobalValue *, 8>` with more than 8 elements is not deterministic. Use a SmallVector instead because `Used` is guaranteed to contain unique elements. While here, decrease inline element counts from 8 to 4. The number of `llvm.used`/`llvm.compiler.used` elements is usually 0 or 1. For full LTO/hybrid LTO, the number may be large, so we need to be careful. According to tejohnson's analysis https://reviews.llvm.org/D97128#2582399 , 4 is good for a large project with WholeProgramDevirt, when available_externally vtables are placed in the llvm.compiler.used set. Differential Revision: https://reviews.llvm.org/D97128	2021-02-23 16:09:05 -08:00
Heejin Ahn	ea8c6375e3	[WebAssembly] Fix incorrect grouping and sorting of exceptions This CL is not big but contains changes that span multiple analyses and passes. This description is very long because it tries to explain basics on what each pass/analysis does and why we need this change on top of that. Please feel free to skip parts that are not necessary for your understanding. --- `WasmEHFuncInfo` contains the mapping of <EH pad, the EH pad's next unwind destination>. The value (unwind dest) here is where an exception should end up when it is not caught by the key (EH pad). We record this info in WasmEHPrepare to fix catch mismatches, because the CFG itself does not have this info. A CFG only contains BBs and predecessor-successor relationship between them, but in `WasmEHFuncInfo` the unwind destination BB is not necessarily a successor or the key EH pad BB. Their relationship can be intuitively explained by this C++ code snippet: ``` try { try { foo(); } catch (int) { // EH pad ... } } catch (...) { // unwind destination } ``` So when `foo()` throws, it goes to `catch (int)` first. But if it is not caught by it, it ends up in the next unwind destination `catch (...)`. This unwind destination is what you see in `catchswitch`'s `unwind label %bb` part. --- `WebAssemblyExceptionInfo` groups exceptions so that they can be sorted continuously together in CFGSort, as we do for loops. What this analysis does is very simple: it creates a single `WebAssemblyException` per EH pad, and all BBs that are dominated by that EH pad are included in this exception. We also identify subexception relationship in this way: if EHPad A domiantes EHPad B, EHPad B's exception is a subexception of EHPad A's exception. This simple rule turns out to be incorrect in some cases. In `WasmEHFuncInfo`, if EHPad A's unwind destination is EHPad B, it means semantically EHPad B should not be included in EHPad A's exception, because it does not make sense to rethrow/delegate to an inner scope. This is what happened in CFGStackify as a result of this: ``` try try catch ... <- %dest_bb is among here! end delegate %dest_bb ``` So this patch adds a phase in `WebAssemblyExceptionInfo::recalculate` to make sure excptions' unwind destinations are not subexceptions of their unwind sources in `WasmEHFuncInfo`. But this alone does not prevent `dest_bb` in the example above from being sorted within the inner `catch`'s exception, even if its exception is not a subexception of that `catch`'s exception anymore, because of how CFGSort works, which will be explained below. --- CFGSort places BBs within the same `SortRegion` (loop or exception) continuously together so they can be demarcated with `loop`-`end_loop` or `catch`-`end_try` in CFGStackify. `SortRegion` is a wrapper for one of `MachineLoop` or `WebAssemblyException`. `SortRegionInfo` already does some complicated things because there discrepancies between those two data structures. `WebAssemblyException` is what we control, and it is defined as an EH pad as its header and BBs dominated by the header as its BBs (with a newly added exception of unwind destinations explained in the previous paragraph). But `MachineLoop` is an LLVM data structure and uses the standard loop detection algorithm. So by the algorithm, BBs that are 1. dominated by the loop header and 2. have a path back to its header. Because of the second condition, many BBs that are dominated by the loop header are not included in the loop. So BBs that contain `return` or branches to outside of the loop are not technically included in `MachineLoop`, but they can be sorted together with the loop with no problem. Maybe to relax the condition, in CFGSort, when we are in a `SortRegion` we allow sorting of not only BBs that belong to the current innermost region but also BBs that are by the current region header. (This was written this way from the first version written by Dan, when only loops existed.) But now, we have cases in exceptions when EHPad B is the unwind destination for EHPad A, even if EHPad B is dominated by EHPad A it should not be included in EHPad A's exception, and should not be sorted within EHPad A. One way to make things work, at least correctly, is change `dominates` condition to `contains` condition for `SortRegion` when sorting BBs, but this will change compilation results for existing non-EH code and I can't be sure it will not degrade performance or code size. I think it will degrade performance because it will force many BBs dominated by a loop, which don't have the path back to the header, to be placed after the loop and it will likely to create more branches and blocks. So this does a little hacky check when adding BBs to `Preferred` list: (`Preferred` list is a ready list. CFGSort maintains ready list in two priority queues: `Preferred` and `Ready`. I'm not very sure why, but it was written that way from the beginning. BBs are first added to `Preferred` list and then some of them are pushed to `Ready` list, so here we only need to guard condition for `Preferred` list.) When adding a BB to `Preferred` list, we check if that BB is an unwind destination of another BB. To do this, this adds the reverse mapping, `UnwindDestToSrc`, and getter methods to `WasmEHFuncInfo`. And if the BB is an unwind destination, it checks if the current stack of regions (`Entries`) contains its source BB by traversing the stack backwards. If we find its unwind source in there, we add the BB to its `Deferred` list, to make sure that unwind destination BB is added to `Preferred` list only after that region with the unwind source BB is sorted and popped from the stack. --- This does not contain a new test that crashes because of this bug, but this fix changes the result for one of existing test case. This test case didn't crash because it fortunately didn't contain `delegate` to the incorrectly placed unwind destination BB. Fixes https://github.com/emscripten-core/emscripten/issues/13514. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D97247	2021-02-23 14:54:55 -08:00
Matthew Voss	6da7d31416	[llvm-profdata] Emit Error when Invalid MemOpSize Section is Created by llvm-profdata Under certain (currently unknown) conditions, llvm-profdata is outputting profiles that have two consecutive entries in the MemOPSize section for the value 0. This causes the PGOMemOPSizeOpt pass to output an invalid switch instruction with two cases for 0. As mentioned, we’re not quite sure what’s causing this to happen, but this patch prevents llvm-profdata from outputting a profile that has this problem and gives an error with a request for a reproducible. Differential Revision: https://reviews.llvm.org/D92074	2021-02-23 12:51:54 -08:00
Jay Foad	a6be26710b	[GlobalISel] Make more use of replaceSingleDefInstWithReg. NFC.	2021-02-23 17:08:34 +00:00
Juneyoung Lee	19c2e12947	[JumpThreading] Update computeValueKnownInPredecessors to recognize logical and/or patterns This allows JumpThreading's computeValueKnownInPredecessors to recognize select form of and/or patterns as well.	2021-02-24 00:06:10 +09:00
Nate Chandler	01b4890e47	Add @llvm.coro.async.size.replace intrinsic. The new intrinsic replaces the size in one specified AsyncFunctionPointer with the size in another. This ability is necessary for functions which merely forward to async functions such as those defined for partial applications. Reviewed By: aschwaighofer Differential Revision: https://reviews.llvm.org/D97229	2021-02-23 06:43:52 -08:00
David Green	dd2dbf7ee2	[TTI] Change getOperandsScalarizationOverhead to take Type args As a followup to D95291, getOperandsScalarizationOverhead was still using a VF as a vector factor if the arguments were scalar, and would assert on certain matrix intrinsics with differently sized vector arguments. This patch removes the VF arg, instead passing the Types through directly. This should allow it to more accurately compute the cost without having to guess at which operands will be vectorized, something difficult with more complex intrinsics. This adjusts one SVE test as it is now calling the wrong intrinsic vs veccall. Without invalid InstructCosts the cost of the scalarized intrinsic is too low. This should get fixed when the cost of scalarization is accounted for with scalable types. Differential Revision: https://reviews.llvm.org/D96287	2021-02-23 13:04:59 +00:00
David Green	bd4b61efbd	[CostModel] Remove VF from IntrinsicCostAttributes getIntrinsicInstrCost takes a IntrinsicCostAttributes holding various parameters of the intrinsic being costed. It can either be called with a scalar intrinsic (RetTy==Scalar, VF==1), with a vector instruction (RetTy==Vector, VF==1) or from the vectorizer with a scalar type and vector width (RetTy==Scalar, VF>1). A RetTy==Vector, VF>1 is considered an error. Both of the vector modes are expected to be treated the same, but because this is confusing many backends end up getting it wrong. Instead of trying work with those two values separately this removes the VF parameter, widening the RetTy/ArgTys by VF used called from the vectorizer. This keeps things simpler, but does require some other modifications to keep things consistent. Most backends look like this will be an improvement (or were not using getIntrinsicInstrCost). AMDGPU needed the most changes to keep the code from `c230965ccf` working. ARM removed the fix in `dfac521da1`, webassembly happens to get a fixup for an SLP cost issue and both X86 and AArch64 seem to now be using better costs from the vectorizer. Differential Revision: https://reviews.llvm.org/D95291	2021-02-23 13:03:26 +00:00
Alexey Lapshin	875b3b2cdd	[Support] Add reserve() method to the raw_ostream. If resulting size of the output stream is already known, then the space for stream data could be preliminary allocated in some cases. f.e. raw_string_ostream could preallocate the space for the target string(it allows to avoid reallocations during writing into the stream). Differential Revision: https://reviews.llvm.org/D91693	2021-02-23 14:06:38 +03:00
Andy Wingo	7dc98adbb0	Revert "[WebAssembly] call_indirect issues table number relocs" This reverts commit `861dbe1a02`. It broke emscripten -- see https://reviews.llvm.org/D90948#2578843.	2021-02-23 11:48:08 +01:00
Liu, Chen3	f8b9035aae	[X86] Support amx-int8 intrinsic. Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD. Differential Revision: https://reviews.llvm.org/D97259	2021-02-23 17:08:05 +08:00
Lang Hames	430817d0d5	[JITLink] Add a getFixupAddress convenience method to Block.	2021-02-23 11:08:54 +11:00
Lang Hames	adf2098bd8	[JITLink] Don't allow creation of sections with duplicate names.	2021-02-23 11:08:54 +11:00
Heejin Ahn	a08e609d2e	[WebAssembly] Rename methods in WasmEHFuncInfo (NFC) This renames variable and method names in `WasmEHFuncInfo` class to be simpler and clearer. For example, unwind destinations are EH pads by definition so it doesn't necessarily need to be included in every method name. Also I am planning to add the reverse mapping in a later CL, something like `UnwindDestToSrc`, so this renaming will make meanings clearer. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D97173	2021-02-22 12:16:11 -08:00
Leonard Chan	1c932baeaa	[llvm][Bitcode] Add bitcode reader/writer for DSOLocalEquivalent This is necessary for compilation with [thin]lto. Differential Revision: https://reviews.llvm.org/D96170	2021-02-22 10:37:57 -08:00
Nikita Popov	5e7e499b91	[JumpThreading] Clone noalias.scope.decl when threading blocks When cloning instructions during jump threading, also clone and adapt any declared scopes. This is primarily important when threading loop exits, because we'll end up with two dominating scope declarations in that case (at least after additional loop rotation). This addresses a loose thread from https://reviews.llvm.org/rG2556b413a7b8#975012. Differential Revision: https://reviews.llvm.org/D97154	2021-02-22 18:35:30 +01:00
Ryan Santhiraraja	2c25efcbd3	[AArch64] Adding SHA3 Intrinsics support This patch adds the following SHA3 Intrinsics: vsha512hq_u64, vsha512h2q_u64, vsha512su0q_u64, vsha512su1q_u64 veor3q_u8 veor3q_u16 veor3q_u32 veor3q_u64 veor3q_s8 veor3q_s16 veor3q_s32 veor3q_s64 vrax1q_u64 vxarq_u64 vbcaxq_u8 vbcaxq_u16 vbcaxq_u32 vbcaxq_u64 vbcaxq_s8 vbcaxq_s16 vbcaxq_s32 vbcaxq_s64 Note need to include +sha3 and +crypto when building from the front-end Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D96381	2021-02-22 12:09:20 +00:00
Andy Wingo	861dbe1a02	[WebAssembly] call_indirect issues table number relocs If the reference-types feature is enabled, call_indirect will explicitly reference its corresponding function table via `TABLE_NUMBER` relocations against a table symbol. Also, as before, address-taken functions can also cause the function table to be created, only with reference-types they additionally cause a symbol table entry to be emitted. We abuse the used-in-reloc flag on symbols to indicate which tables should end up in the symbol table. We do this because unfortunately older wasm-ld will carp if it see a table symbol. Differential Revision: https://reviews.llvm.org/D90948	2021-02-22 10:13:36 +01:00
Kazu Hirata	5032b5890b	[llvm] Fix header guards (NFC) Identified with llvm-header-guard.	2021-02-21 19:58:05 -08:00
madhur13490	5fe23de5db	[NFC] Remove redundant word in comment Differential Revision: https://reviews.llvm.org/D97157	2021-02-21 18:04:20 +00:00
Nikita Popov	e0615bcd39	[Loads] Add optimized FindAvailableLoadedValue() overload (NFCI) FindAvailableLoadedValue() accepts an iterator by reference. If no available value is found, then the iterator will either be left at a clobbering instruction or the beginning of the basic block. This allows using FindAvailableLoadedValue() across multiple blocks. If this functionality is not needed, as is the case in InstCombine, then we can use a much more efficient implementation: First try to find an available value, and only perform clobber checks if we actually found one. As this function only looks at a very small number of instructions (6 by default) and usually doesn't find an available value, this saves many expensive alias analysis queries.	2021-02-21 18:42:56 +01:00
Juneyoung Lee	aacf7878bc	[ValueTracking] Improve impliesPoison This patch improves ValueTracking's impliesPoison(V1, V2) to do this reasoning: ``` %res = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b) %overflow = extractvalue { i64, i1 } %res, 1 %mul = extractvalue { i64, i1 } %res, 0 ; If %mul is poison, %overflow is also poison, and vice versa. ``` This improvement leads to supporting this optimization under `-instcombine-unsafe-select-transform=0`: ``` define i1 @test2_logical(i64 %a, i64 %b, i64* %ptr) { ; CHECK-LABEL: @test2_logical( ; CHECK-NEXT: [[MUL:%.]] = mul i64 [[A:%.]], [[B:%.]] ; CHECK-NEXT: [[TMP1:%.]] = icmp ne i64 [[A]], 0 ; CHECK-NEXT: [[TMP2:%.]] = icmp ne i64 [[B]], 0 ; CHECK-NEXT: [[OVERFLOW_1:%.]] = and i1 [[TMP1]], [[TMP2]] ; CHECK-NEXT: [[NEG:%.]] = sub i64 0, [[MUL]] ; CHECK-NEXT: store i64 [[NEG]], i64 [[PTR:%.]], align 8 ; CHECK-NEXT: ret i1 [[OVERFLOW_1]] ; %res = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b) %overflow = extractvalue { i64, i1 } %res, 1 %mul = extractvalue { i64, i1 } %res, 0 %cmp = icmp ne i64 %mul, 0 %overflow.1 = select i1 %overflow, i1 true, i1 %cmp %neg = sub i64 0, %mul store i64 %neg, i64 %ptr, align 8 ret i1 %overflow.1 } ``` Previously, this didn't happen because the flag prevented `select i1 %overflow, i1 true, i1 %cmp` from being `or i1 %overflow, %cmp`. Note that the select -> or conversion happens only when `impliesPoison(%cmp, %overflow)` returns true. This improvement allows `impliesPoison` to do the reasoning. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D96929	2021-02-20 13:22:34 +09:00
Craig Topper	baab797878	[ValueTypes] Assert if changeVectorElementType is called on a simple type with an extended element type. Previously we would use the extended implementation, but the extended implementation requires the vector type to be extended so that we can access the LLVMContext. In theory we could detect this case and use the context from the element type instead, but since I know of no cases hitting this in practice today I've done the simplest thing. Also add asserts to several extended EVT functions that assume LLVMTy is non-null. Follow from discussion in D97036 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D97070	2021-02-19 17:30:46 -08:00
Mircea Trofin	82492f24ff	[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit VirtRegAuxInfo is an extensibility point, so the register allocator's decision on which implementation to use should be communicated to the other users - namely, LiveRangeEdit. Differential Revision: https://reviews.llvm.org/D96898	2021-02-19 07:44:28 -08:00
Simon Pilgrim	aa44815f84	Remove unnecessary "using namespace llvm" inside "namespace llvm". NFCI.	2021-02-19 11:15:16 +00:00
Nikita Popov	370addb996	[IR] Move willReturn() to Instruction This moves the willReturn() helper from CallBase to Instruction, so that it can be used in a more generic manner. This will make it easier to fix additional passes (ADCE and BDCE), and will give us one place to change if additional instructions should become non-willreturn (e.g. there has been talk about handling volatile operations this way). I have also included the IntrinsicInst workaround directly in here, so that it gets applied consistently. (As such this change is not entirely NFC -- FuncAttrs will now use this as well.) Differential Revision: https://reviews.llvm.org/D96992	2021-02-19 11:56:01 +01:00
Djordje Todorovic	1a2b3536ef	Reland "[Debugify] Make the debugify aware of the original (-g) Debug Info" As discussed on the RFC [0], I am sharing the set of patches that enables checking of original Debug Info metadata preservation in optimizations. The proof-of-concept/proposal can be found at [1]. The implementation from the [1] was full of duplicated code, so this set of patches tries to merge this approach into the existing debugify utility. For example, the utility pass in the original-debuginfo-check mode could be invoked as follows: $ opt -verify-debuginfo-preserve -pass-to-test sample.ll Since this is very initial stage of the implementation, there is a space for improvements such as: - Add support for the new pass manager - Add support for metadata other than DILocations and DISubprograms [0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ [1] https://github.com/djolertrk/llvm-di-checker Differential Revision: https://reviews.llvm.org/D82545 The test that was failing is now forced to use the old PM.	2021-02-18 23:29:22 -08:00
Serge Pavlov	2c4f60e45b	[FPEnv][AArch64] Implement lowering of llvm.set.rounding Differential Revision: https://reviews.llvm.org/D96836	2021-02-19 13:16:51 +07:00
Lang Hames	0469256d35	[ORC] Print CPU feature string in JITTargetMachineBuilder debugging output.	2021-02-19 15:18:19 +11:00
Konstantin Zhuravlyov	71d1f785a5	AMDGPU/ELF: Sort MACHs by value and add missing reserved MACHs - Sort MACHs by its value - Add missing reserved MACHs - EF_AMDGPU_MACH_AMDGCN_RESERVED_0X3D - EF_AMDGPU_MACH_AMDGCN_RESERVED_0X3E Differential Revision: https://reviews.llvm.org/D97010	2021-02-18 20:46:27 -05:00
Wei Mi	5fb65c02ca	[SampleFDO] Stop repeated indirect call promotion for the same target. Found a problem in indirect call promotion in sample loader pass. Currently if an indirect call is promoted for a target, and if the parent function is inlined into some other function, the indirect call can be promoted for the same target again. That is redundent which can harm performance and can cause excessive compile time in some extreme case. The patch fixes the issue. If a target is promoted for an indirect call, the patch will write ICP metadata with the target call count being set to 0. In the later ICP in sample profile loader, if it sees a target has 0 count for an indirect call, it knows the target has been promoted and won't do indirect call promotion for the indirect call. The fix brings 0.1~0.2% performance on our search benchmark. Differential Revision: https://reviews.llvm.org/D96806	2021-02-18 17:01:32 -08:00
Leonard Chan	c77659e549	[llvm][IR] Do not place constants with static relocations in a mergeable section This patch provides two major changes: 1. Add getRelocationInfo to check if a constant will have static, dynamic, or no relocations. (Also rename the original needsRelocation to needsDynamicRelocation.) 2. Only allow a constant with no relocations (static or dynamic) to be placed in a mergeable section. This will allow unused symbols that contain static relocations and happen to fit in mergeable constant sections (.rodata.cstN) to instead be placed in unique-named sections if -fdata-sections is used and subsequently garbage collected by --gc-sections. See https://lists.llvm.org/pipermail/llvm-dev/2021-February/148281.html. Differential Revision: https://reviews.llvm.org/D95960	2021-02-18 15:39:00 -08:00
Petr Hosek	5fbd1a333a	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 14:34:39 -08:00
Nikita Popov	70e3c9a8b6	[BasicAA] Always strip single-argument phi nodes We can always look through single-argument (LCSSA) phi nodes when performing alias analysis. getUnderlyingObject() already does this, but stripPointerCastsAndInvariantGroups() does not. We still look through these phi nodes with the usual aliasPhi() logic, but sometimes get sub-optimal results due to the restrictions on value equivalence when looking through arbitrary phi nodes. I think it's generally beneficial to keep the underlying object logic and the pointer cast stripping logic in sync, insofar as it is possible. With this patch we get marginally better results: aa.NumMayAlias \| 5010069 \| 5009861 aa.NumMustAlias \| 347518 \| 347674 aa.NumNoAlias \| 27201336 \| 27201528 ... licm.NumPromoted \| 1293 \| 1296 I've renamed the relevant strip method to stripPointerCastsForAliasAnalysis(), as we're past the point where we can explicitly spell out everything that's getting stripped. Differential Revision: https://reviews.llvm.org/D96668	2021-02-18 23:07:50 +01:00
Petr Hosek	fbf8b957fd	Revert "[Coverage] Store compilation dir separately in coverage mapping" This reverts commit `97ec8fa5bb` since the test is failing on some bots.	2021-02-18 12:50:24 -08:00
Petr Hosek	97ec8fa5bb	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 12:27:42 -08:00
Sam Powell	eb2eeeb76f	[llvm][TextAPI] add equality operator for InterfaceFile This patch adds functionality to compare for the equality between `InterfaceFile`s based on attributes specific to linking. Reviewed By: cishida, steven_wu Differential Revision: https://reviews.llvm.org/D96629	2021-02-18 11:53:08 -08:00
Ta-Wei Tu	f70cdc5b5c	[NPM] Properly reset parent loop after loop passes This fixes https://bugs.llvm.org/show_bug.cgi?id=49185 When `NDEBUG` is not set, `LPMUpdater` checks if the added loops have the same parent loop as the current one in `addSiblingLoops`. If multiple loop passes are executed through `LoopPassManager`, `U.ParentL` will be the same across all passes. However, the parent loop might change after running a loop pass, resulting in assertion failures in subsequent passes. This patch resets `U.ParentL` after running individual loop passes in `LoopPassManager`. Reviewed By: asbirlea, ychen Differential Revision: https://reviews.llvm.org/D96727	2021-02-19 02:50:53 +08:00
Bradley Smith	8bad8a43c3	[AArch64][SVE] Add patterns to generate FMLA/FMLS/FNMLA/FNMLS/FMAD Adjust generateFMAsInMachineCombiner to return false if SVE is present in order to combine fmul+fadd into fma. Also add new pseudo instructions so as to select the most appropriate of FMLA/FMAD depending on register allocation. Depends on D96599 Differential Revision: https://reviews.llvm.org/D96424	2021-02-18 16:55:16 +00:00
Paul C. Anagnostopoulos	49d663d546	Revert "[TableGen] Improve algorithms for processing template arguments" This reverts commit e589207d5aaee6cbf1d7c7de8867a17727d14aca.	2021-02-18 09:26:26 -05:00
Paul C. Anagnostopoulos	d248cce44e	[TableGen] Improve algorithms for processing template arguments Rework template argument checking so that all arguments are type-checked and cast if necessary. Add a test. Differential Revision: https://reviews.llvm.org/D96416	2021-02-18 09:15:26 -05:00
Djordje Todorovic	c1e23894fc	Revert "[Debugify] Make the debugify aware of the original (-g) Debug Info" This reverts rG8ee7c7e02953. One test is failing, I'll reland this as soon as possible.	2021-02-18 02:04:27 -08:00
Djordje Todorovic	8ee7c7e029	[Debugify] Make the debugify aware of the original (-g) Debug Info As discussed on the RFC [0], I am sharing the set of patches that enables checking of original Debug Info metadata preservation in optimizations. The proof-of-concept/proposal can be found at [1]. The implementation from the [1] was full of duplicated code, so this set of patches tries to merge this approach into the existing debugify utility. For example, the utility pass in the original-debuginfo-check mode could be invoked as follows: $ opt -verify-debuginfo-preserve -pass-to-test sample.ll Since this is very initial stage of the implementation, there is a space for improvements such as: - Add support for the new pass manager - Add support for metadata other than DILocations and DISubprograms [0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ [1] https://github.com/djolertrk/llvm-di-checker Differential Revision: https://reviews.llvm.org/D82545	2021-02-18 01:52:16 -08:00
Chen Zheng	4c23707a41	[XCOFF][NFC] make StorageMappingClass/SymbolType member optional This patch makes StorageMappingClass/SymbolType member optional in class MCSectionXCOFF. Non-csect sections like debug sections have no such properties. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D96641	2021-02-18 04:46:05 -05:00
Fangrui Song	018a484cd2	[llvm-objdump] Map STT_TLS to ST_Other (previously ST_Data) ST_Data is used to model BFD `BFD_OBJECT`. A STT_TLS symbol does not have the `BFD_OBJECT` flag in BFD. This makes sense because a STT_TLS symbol is like in a different address space, normal data/object properties do not apply on them. With this change, a STT_TLS symbol will not be displayed as 'O'. This new behavior matches objdump. Differential Revision: https://reviews.llvm.org/D96735	2021-02-17 23:17:20 -08:00
Chen Zheng	5517923b1c	[XCOFF][NFC] make csect properties optional for getXCOFFSection We are going to support debug sections for XCOFF. So the csect properties are not necessary. This patch makes these properties optional. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D95931	2021-02-17 20:51:42 -05:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
Rahman Lavaee	0252e6ead1	[obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field. As discussed in D95511, this allows us to encode invalid BBAddrMap sections to be used in more rigorous testing. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D96831	2021-02-17 15:45:13 -08:00
Rong Xu	7397905ab0	[SampleFDO] Third Try: Refactor SampleProfile.cpp Apply the patch for the third time after fixing buildbot failures. Refactor SampleProfile.cpp to use the core code in CodeGen. The main changes are: (1) Move SampleProfileLoaderBaseImpl class to a header file. (2) Split SampleCoverageTracker to a head file and a cpp file. (3) Move the common codes (common options and callsiteIsHot()) to the common cpp file. (4) Add inline keyword to avoid duplicated symbols -- they will be removed later when the class is changed to a template. Differential Revision: https://reviews.llvm.org/D96455	2021-02-17 15:31:50 -08:00
Jessica Paquette	60aa646441	[GlobalISel] Add G_ASSERT_SEXT This adds a G_ASSERT_SEXT opcode, similar to G_ASSERT_ZEXT. This instruction signifies that an operation was already sign extended from a smaller type. This is useful for functions with sign-extended parameters. E.g. ``` define void @foo(i16 signext %x) { ... } ``` This adds verifier, regbankselect, and instruction selection support for G_ASSERT_SEXT equivalent to G_ASSERT_ZEXT. Differential Revision: https://reviews.llvm.org/D96890	2021-02-17 13:10:34 -08:00

... 12 13 14 15 16 ...

45424 Commits