llvm-project

Author	SHA1	Message	Date
Dmitry Preobrazhensky	955cc56af4	[AMDGPU][GFX1030][DOC][NFC] Update assembler syntax description Summary of changes: - Update FLAT LDS syntax (see https://reviews.llvm.org/D125126)	2022-07-28 14:36:53 +03:00
Renato Golin	94761b9dba	Update ProgrammersManual STL docs The SGI page doesn't exist anymore and isn't really relevant at this day and age. While at it, added the "other" main C++ website and moved all URLs to HTTPS.	2022-07-27 10:30:47 +01:00
Tom Stellard	809855b56f	Bump the trunk major version to 16	2022-07-26 21:34:45 -07:00
Dmitry Preobrazhensky	9891bb2302	[AMDGPU][GFX10][DOC][NFC] Update assembler syntax description Summary of changes: - Update FLAT LDS syntax (see https://reviews.llvm.org/D125126)	2022-07-26 19:33:31 +03:00
John Ericson	a5640968f2	[llvm][cmake] Follow up to D117973 1. Slightly document the "mark advanced" variable used to control the installed CMake package dir. I would document it more, but I am considering in the future adding pkg-config support in this manner, after which `_PACKGE_DIR` is probably better called `_CMAKE_PACKGE_DIR` or similar. 2. Convey the custom path to the legacy `llvm-config` binary. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D130539	2022-07-26 14:51:12 +00:00
Augie Fackler	63b1582350	LangRef: note that `allockind("free")` requires void return Otherwise we have to work pretty hard to ensure a discarded alloc/free pair doesn't remove a return value that's still useful. Differential Revision: https://reviews.llvm.org/D130568	2022-07-26 10:10:14 -04:00
David Spickett	2f9fa9ef53	[lldb][AArch64] Add support for memory tags in core files This teaches ProcessElfCore to recognise the MTE tag segments. https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support These segments contain all the tags for a matching memory segment which will have the same size in virtual address terms. In real terms it's 2 tags per byte so the data in the segment is much smaller. Since MTE is the only tag type supported I have hardcoded some things to those values. We could and should support more formats as they appear but doing so now would leave code untested until that happens. A few things to note: * /proc/pid/smaps is not in the core file, only the details you have in "maps". Meaning we mark a region tagged only if it has a tag segment. * A core file supports memory tagging if it has at least 1 memory tag segment, there is no other flag we can check to tell if memory tagging was enabled. (unlike a live process that can support memory tagging even if there are currently no tagged memory regions) Tests have been added at the commands level for a core file with mte and without. There is a lot of overlap between the "memory tag read" tests here and the unit tests for MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's worth keeping to check ProcessElfCore doesn't cause an assert. Depends on D129487 Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D129489	2022-07-26 08:46:36 +01:00
Justin Brooks	fb95b8dc35	[Kaleidoscope] Fix DWARF function creation example The full code listing was fixed in `fdaeb0c647` Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D130217	2022-07-25 18:19:59 +00:00
Nikita Popov	b66ca91fe6	[Docs] Update GEP docs for opaque pointers Update the GEP FAQ to use opaque pointers. This requires more than a syntactic change in some places, because some of the concerns just don't make sense anymore (trying to index past a ptr member in a struct for example). This also fixes uses of incorrect syntax to declare or reference globals. Differential Revision: https://reviews.llvm.org/D130353	2022-07-25 09:52:14 +02:00
Nikita Popov	7ac7ec8202	[LangRef] Update for opaque pointers (NFC) Update LangRef examples to use opaque pointers in most places. I've retained typed pointers in a few cases where opaque pointers don't make much sense, e.g. pointer to pointer bitcasts. Differential Revision: https://reviews.llvm.org/D130356	2022-07-25 09:45:49 +02:00
Fangrui Song	ef03f6623c	[llvm-objcopy] Simplify --compress-debug-sections handling with AliasArgs. NFC	2022-07-25 00:31:00 -07:00
zhijian	74cb8dfaac	[AIX][NFC] modify the llvm-ar help information for big archive. Reviewers: James Henderson Differential Revision: https://reviews.llvm.org/D130292	2022-07-22 13:52:18 -04:00
zhijian	4f2cfbe531	[llvm-ar] Add object mode option -X for AIX Summary: 1. Added a new object mode option -X to llvm-ar. On AIX, the ar command has an object mode option -X; see the "-X mode" part of https://www.ibm.com/docs/ko/aix/7.1?topic=ar-command Specifies the type of object file ar should examine. The mode must be one of the following: 32 Processes only 32-bit object files 64 Processes only 64-bit object files 32_64 Processes both 32-bit and 64-bit object files any Processes all of the supported object files. The default is to process 32-bit object files (ignore 64-bit objects). The mode can also be set with the OBJECT_MODE environment variable. For example, OBJECT_MODE=64 causes ar to process any 64-bit objects and ignore 32-bit objects. The -X flag overrides the OBJECT_MODE variable. 2. Before the new option -X was added, llvm-ar behaved like -Xany by default; after adding it, the default changes to -X32. For test cases that pass both 32-bit and 64-bit object files to the same llvm-ar command, we need to add "export OBJECT_MODE=any" to the test case to change llvm-ar's default object mode. Reviewers: James Henderson, Owen Reynolds, Fangrui Song Differential Revision: https://reviews.llvm.org/D127864	2022-07-22 09:55:21 -04:00
Nikita Popov	5ab077f911	[LangRef] Update opaque pointers status (NFC) Opaque pointers support is complete and default. Specify ptr as the normal pointer type and i8* as something supported under non-default options. A larger update of examples in LangRef is still needed.	2022-07-22 14:47:31 +02:00
Nikita Popov	5102084787	[Docs] Add release notes for opaque pointers (NFC)	2022-07-22 14:14:03 +02:00
Daniel Thornburgh	17e4c217b6	[Symbolizer] Implement contextual symbolizer markup elements. This change implements the contextual symbolizer markup elements: reset, module, and mmap. These provide information about the runtime context of the binary necessary to resolve addresses to symbolic values. Summary information is printed to the output about this context. Multiple mmap elements for the same module line are coalesced together. The standard requires that such elements occur on their own lines to allow for this; accordingly, anything after a contextual element on a line is silently discarded. Implementing this cleanly requires that the filter drive the parser; this allows skipped sections to avoid being parsed. This also makes the filter quite a bit easier to use, at the cost of some unused flexibility. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D129519	2022-07-21 11:29:19 -07:00
Chuanqi Xu	645d2dd3a9	Revert "Don't treat readnone call in presplit coroutine as not access memory" This reverts commit `57224ff4a6`. This commit may trigger crashes on some workloads. Revert it for clearness.	2022-07-20 17:00:58 +08:00
Chuanqi Xu	57224ff4a6	Don't treat readnone call in presplit coroutine as not access memory To solve the readnone problems in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for details. According to the discussion, we decide to fix the problem by inserting isPresplitCoroutine() checks in different passes instead of wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes. In this direction, we might not be able to cover every case at first. Let's take a "find and fix" strategy. Reviewed By: nikic, nhaehnle, jyknight Differential Revision: https://reviews.llvm.org/D127383	2022-07-20 10:37:23 +08:00
Yusra Syeda	6fb27bc2e3	[SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention Differential Revision: https://reviews.llvm.org/D127328	2022-07-19 13:55:25 -04:00
Alexey Lapshin	4539b44148	[Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF. This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal): ``` ./llvm-dwarfutil [options] <input file> <output file> --garbage-collection Do garbage collection for debug info(default) -j <value> Alias for --num-threads --no-garbage-collection Don`t do garbage collection for debug info --no-odr-deduplication Don`t do ODR deduplication for debug types --no-odr Alias for --no-odr-deduplication --no-separate-debug-file Create single output file, containing debug tables(default) --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine --odr-deduplication Do ODR deduplication for debug types(default) --odr Alias for --odr-deduplication --separate-debug-file Create two output files: file w/o debug tables and file with debug tables --tombstone [bfd,maxpc,exec,universal] Tombstone value used as a marker of invalid address(default: universal) =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges =exec - Match with address ranges of executable sections =universal - Both: bfd and maxpc ``` Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D86539	2022-07-19 15:11:36 +03:00
Alexey Lapshin	e717f91c96	Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF." This reverts commit `e2147c26bd`.	2022-07-19 12:17:47 +03:00
Alexey Lapshin	e2147c26bd	[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF. This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal): ``` ./llvm-dwarfutil [options] <input file> <output file> --garbage-collection Do garbage collection for debug info(default) -j <value> Alias for --num-threads --no-garbage-collection Don`t do garbage collection for debug info --no-odr-deduplication Don`t do ODR deduplication for debug types --no-odr Alias for --no-odr-deduplication --no-separate-debug-file Create single output file, containing debug tables(default) --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine --odr-deduplication Do ODR deduplication for debug types(default) --odr Alias for --odr-deduplication --separate-debug-file Create two output files: file w/o debug tables and file with debug tables --tombstone [bfd,maxpc,exec,universal] Tombstone value used as a marker of invalid address(default: universal) =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges =exec - Match with address ranges of executable sections =universal - Both: bfd and maxpc ``` Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D86539	2022-07-19 11:18:36 +03:00
Alex Bradbury	86c4242976	[docs] Remove unmaintained target feature matrix Back in 2017, a table was added to the codegen documentation listing which features various backends support. It received a few updates since then, but not since the end of 2019. Having such a table is a nice idea, but it hasn't been kept up to date, it isn't easy to ensure that it is up to date, and the table probably isn't very discoverable for most users who would be interested in this information anyway (it would be better suited to some kind of "what can LLVM do for me?" page). For all of the above reasons, I believe it makes sense to remove it. Differential Revision: https://reviews.llvm.org/D129996	2022-07-18 18:38:23 +01:00
Fangrui Song	b3fd3a9ac3	[IR] Allow absence for Min module flags and make AArch64 BTI/PAC-RET flags backward compatible D123493 introduced llvm::Module::Min to encode module flags metadata for AArch64 BTI/PAC-RET. llvm::Module::Min does not take effect when the flag is absent in one module. This behavior is misleading and does not address backward compatibility problems (when a bitcode with "branch-target-enforcement"==1 and another without the flag are merged, the merge result is 1 instead of 0). To address the problems, require Min flags to be non-negative and treat absence as having a value of zero. For an old bitcode without "branch-target-enforcement"/"sign-return-address", its value is as if 0. Differential Revision: https://reviews.llvm.org/D129911	2022-07-18 09:35:12 -07:00
Dmitry Preobrazhensky	ca2e3ffbc1	[AMDGPU][GFX90A][DOC][NFC] Update assembler syntax description Update FLAT LDS syntax (see https://reviews.llvm.org/D125126).	2022-07-18 13:56:50 +03:00
Dmitry Preobrazhensky	7648e8d9ca	[AMDGPU][GFX9][DOC][NFC] Update assembler syntax description Update FLAT LDS syntax (see https://reviews.llvm.org/D125126).	2022-07-18 13:52:05 +03:00
Abinav Puthan Purayil	d96361d714	[AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map This change introduces the dynamic stack boolean field to code-object-v3 and above under the code properties of the kernel descriptor and under the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to the is_dynamic_callstack field of amd_kernel_code_t. Differential Revision: https://reviews.llvm.org/D128344	2022-07-18 10:07:13 +05:30
David Kreitzer	c720b6fddd	Clarify the behavior of the llvm.vector.insert/extract intrinsics when the index is out of range. Both intrinsics return a poison value. Consequently, mark the intrinsics speculatable. Differential Revision: https://reviews.llvm.org/D129656	2022-07-15 07:56:44 -07:00
Hans Wennborg	07022e6cf9	[docs] Note about how to handle 'llvm-mt: error: no libxml2' See https://github.com/llvm/llvm-project/issues/55817 and https://discourse.llvm.org/t/cannot-cmake-self-hosted-clang-on-windows-for-lack-of-libxml2/58793 Not sure where is the best place to put this, but hopefully this will be found by those searching for the error message. Differential revision: https://reviews.llvm.org/D129770	2022-07-15 16:03:07 +02:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limitations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Tom Stellard	5b0788fef8	Remove left over merge marker from `4b1e3d1937`	2022-07-14 14:51:44 -07:00
Tom Stellard	4b1e3d1937	[gold] Ignore bitcode from sections inside object files -fembed-bitcode will put bitcode into special sections within object files, but this is not meant to be used by LTO, so the gold plugin should ignore it. https://github.com/llvm/llvm-project/issues/47216 Reviewed By: tejohnson, MaskRay Differential Revision: https://reviews.llvm.org/D116995	2022-07-14 14:46:15 -07:00
Nick Desaulniers	140bfdca60	[clang][CodeGen] add fn_ret_thunk_extern to synthetic fns Follow up fix to commit `2240d72f15` ("[X86] initial -mfunction-return=thunk-extern support") https://reviews.llvm.org/D129572 @nathanchance reported that -mfunction-return=thunk-extern was failing to annotate the asan and tsan constructors. https://lore.kernel.org/llvm/Ys7pLq+tQk5xEa%2FB@dev-arch.thelio-3990X/ I then noticed the same occurring for gcov synthetic functions. Similar to commit `2786e67` ("[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all}") define a new module level MetaData, "fn_ret_thunk_extern", then when set adds the fn_ret_thunk_extern IR Fn Attr to synthetically created Functions. Fixes https://github.com/llvm/llvm-project/issues/56514 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129709	2022-07-14 11:25:24 -07:00
Fangrui Song	52cb972537	[CommandLine] --help: print "-o <xxx>" instead of "-o=<xxx>" Accepting -o= is a quirk of CommandLine. For --help, we should print the conventional "-o <xxx>".	2022-07-14 01:28:28 -07:00
Maksim Panchenko	aa8c517ae4	[docs] Add BOLT Office Hours Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D129408	2022-07-13 14:22:00 -07:00
Fangrui Song	0b266f22c3	[docs][llvm-objcopy] Fix unpaired `<align>``	2022-07-13 10:14:26 -07:00
Fangrui Song	b28412d539	[llvm-objcopy][ELF] Add --set-section-type The request is mentioned on D129053. I feel that having this functionality is mildly useful (not strong). * Rename .ctors to .init_array and change sh_type to SHT_INIT_ARRAY (GNU objcopy detects the special name but we don't). * Craft tests for a new SHT_LLVM_* extension Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D129337	2022-07-13 10:04:21 -07:00
Mitch Phillips	fd6dae9799	Update sanitize_* IR documentation. sanitize_none was never actually committed, and should be removed. no_sanitize_memtag is to be removed in D128950. sanitize_memtag is new in D128950. Also update the comments on other no_sanitize_* to indicate that they're impacted by the sanitizer ignorelist and the global-disable attribute. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D129410	2022-07-13 08:54:41 -07:00
Yuanfang Chen	fcb7d76d65	[coroutine] add nomerge function attribute to `llvm.coro.save` It is illegal to merge two `llvm.coro.save` calls unless their `llvm.coro.suspend` users are also merged. Marks it "nomerge" for the moment. This reverts D129025. Alternative to D129025, which affects other token type users like WinEH. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D129530	2022-07-12 10:39:38 -07:00
Nick Desaulniers	2240d72f15	[X86] initial -mfunction-return=thunk-extern support Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute Where the supported <value>s are: * keep (disable) * thunk-extern (enable) thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern. The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk! This new MachineFunctionPass is very similar to "x86-lvi-ret". The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere. Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk"). Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED. * CVE-2022-23816 * CVE-2022-28693 * CVE-2022-29901 See also: * "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." * AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion * TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12 * Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE. 
This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release. Many thanks to folks who provided discreet review off list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information. Link: https://www.youtube.com/watch?v=IF6HbCKQHK8 Link: https://github.com/llvm/llvm-project/issues/54404 Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1 Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60 Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html Reviewed By: aaron.ballman, craig.topper Differential Revision: https://reviews.llvm.org/D129572	2022-07-12 09:17:54 -07:00
Nikita Popov	4bb7b6fae3	[IR] Remove support for float binop constant expressions As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179, this removes support for the floating-point binop constant expressions fadd, fsub, fmul, fdiv and frem. As part of this change, the C APIs LLVMConstFAdd, LLVMConstFSub, LLVMConstFMul, LLVMConstFDiv and LLVMConstFRem are removed. The LLVMBuild APIs should be used instead. Differential Revision: https://reviews.llvm.org/D129478	2022-07-12 09:40:49 +02:00
Xiang1 Zhang	a45dd3d814	[X86] Support -mstack-protector-guard-symbol Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D129346	2022-07-12 10:17:00 +08:00
Xiang1 Zhang	643786213b	Revert "[X86] Support -mstack-protector-guard-symbol" This reverts commit `efbaad1c4a` because the review info was missing.	2022-07-12 10:14:32 +08:00
Xiang1 Zhang	efbaad1c4a	[X86] Support -mstack-protector-guard-symbol	2022-07-12 10:13:48 +08:00
Joseph Huber	ec2b040e18	[llvm-objdump][docs] Fix documentation for offloading flags	2022-07-11 15:44:48 -04:00
mphschmitt	74d62c0a8a	[llvm-objdump][docs] fix typo in llvm-objdump documentation. Fix a typo in llvm-objdump documentation. Differential Revision: https://reviews.llvm.org/D129445 Reviewed by: jhuber6	2022-07-11 15:44:09 -04:00
Nick Desaulniers	ef4beb8bc7	[llvm][docs] commit phabricator patch Users upgrading to PHP 8.1 might start observing failures with `arc`. Commit @ychen's suggestions as a patch in tree that can be applied since arcanist is no longer accepting patches. Also, remove the suggestion to apply an external patch updating CA certs. It seems that this was fixed in upstream arcanist before they stopped accepting patches. Compare `e3659d43d8` vs `13d3a3c3b1` Link: https://secure.phabricator.com/book/phabcontrib/article/contributing_code/ Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129232	2022-07-11 12:33:57 -07:00
Venkata Ramanaiah Nalamothu	370266aec5	[llvm][docs] Fix typos to say subclasses need to override virtual methods but not overload Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D129484	2022-07-11 22:25:14 +05:30
Fangrui Song	7c03b7d668	[llvm-objcopy][ELF] Allow --set-section-flags src=... and --rename-section src=tst * GNU objcopy supports --set-section-flags src=... --rename-section src=tst and --set-section-flags runs first. * GNU objcopy processes --update-section before --rename-section. To match the two behaviors, postpone --rename-section and allow its use together with --set-section-flags. As a side effect, --rename-section=.foo1=.foo2 --add-section=.foo1=/dev/null leads to .foo2 while GNU objcopy surprisingly produces .foo1 (so --set-section-flags --add-section --rename-section do not form a total order). I think the deviation is fine as a total order makes more sense. Rename set-section-flags-and-rename.test to set-section-attr-and-rename.test and additionally test --set-section-alignment Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D129336	2022-07-11 09:04:45 -07:00
Cole Kissane	96063bfa90	[llvm] Remove unused and redundant crc32 function from llvm::compression::zlib namespace * Remove crc32 from zlib compression namespace; people should use `llvm::crc32` instead. Reviewed By: MaskRay, leonardchan Differential Revision: https://reviews.llvm.org/D128754	2022-07-08 11:24:45 -07:00
Cole Kissane	ea61750c35	[NFC] Refactor llvm::zlib namespace * Refactor compression namespaces across the project, making way for a possible introduction of alternatives to zlib compression. Changes are as follows: * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`. Reviewed By: MaskRay, leonardchan, phosek Differential Revision: https://reviews.llvm.org/D128953	2022-07-08 11:19:07 -07:00
Matt Arsenault	1ee6ce9bad	GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD SelectionDAG has a target hook, getExtendForAtomicOps, which it uses in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty ugly (as is having a separate load opcode for atomics), so instead allow making use of atomic zextload. Enable this for AArch64 since the DAG path defaults to the zext behavior. The tablegen changes are pretty ugly, but partially help migrate SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with atomic memory operands. For now the DAG emitter will emit matchers for patterns which the DAG will not produce. I'm still a bit confused by the intent of the isLoad/isStore/isAtomic bits. The DAG implementation rejects trying to use any of these in combination. For now I've opted to make the isLoad checks also check isAtomic, although I think having isLoad and isAtomic set on these makes most sense.	2022-07-08 11:55:08 -04:00
Joseph Huber	85768677f8	[llvm-objdump][Docs] Document new flag	2022-07-07 20:41:53 -04:00
Fangrui Song	472aa7e6bb	[docs] Move code contribution from GettingStarted.rst to Contributing.rst For code contribution, GettingStarted.rst duplicates information in Contributing.rst. The dedicated Contributing.rst is a better place for code contribution, so move the content there. Notes: * D41665 added `Contributing.rst` * D110976 mentioned `git cherry-pick e3659d43d8911e91739f3b0c5935598bceb859aa` workaround Reviewed By: cjdb, fhahn, nickdesaulniers Differential Revision: https://reviews.llvm.org/D129255	2022-07-07 10:51:20 -07:00
Joseph Huber	41fba3c107	[Metadata] Add 'exclude' metadata to add the exclude flags on globals This patch adds a new metadata kind `exclude` which implies that the global variable should be given the necessary flags during code generation to not be included in the final executable. This is done using the ``SHF_EXCLUDE`` flag on ELF for example. This should make it easier to specify this flag on a variable without needing to explicitly check the section name in the target backend. Depends on D129053 D129052 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D129151	2022-07-07 12:20:40 -04:00
Joseph Huber	1d2ce4da84	[Object] Add ELF section type for offloading objects Currently we use the `.llvm.offloading` section to store device-side objects inside the host, creating a fat binary. The contents of these sections is currently determined by the name of the section while it should ideally be determined by its type. This patch adds the new `SHT_LLVM_OFFLOADING` section type to the ELF section types. Which should make it easier to identify this specific data format. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D129052	2022-07-07 12:20:30 -04:00
Joseph Huber	ed801ad5e5	[Clang] Use metadata to make identifying embedded objects easier Currently we use the `embedBufferInModule` function to store binary strings containing device offloading data inside the host object to create a fatbinary. In the case of LTO, we need to extract this object from the LLVM-IR. This patch adds a metadata node for the embedded objects containing the embedded pointers and the sections they were stored at. This should create a cleaner interface for identifying these values. In the future it may be worthwhile to also encode an `ID` in the metadata corresponding to the object's special section type if relevant. This would allow us to extract the data from an object file and LLVM-IR using the same ID. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D129033	2022-07-07 12:20:25 -04:00
Nicolai Hähnle	fdf7e437bf	llvm-c: Add LLVMDeleteInstruction to fix a test issue Not deleting the loose instruction with metadata associated to it causes an assertion when the LLVMContext is destroyed. This was previously hidden by the fact that llvm-c-test does not call LLVMShutdown. The planned removal of ManagedStatic exposed this issue. Differential Revision: https://reviews.llvm.org/D129114	2022-07-07 14:29:20 +02:00
Shilei Tian	1023ddaf77	[LLVM] Add the support for fmax and fmin in atomicrmw instruction This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instruction will be expanded to CAS loop. There are already a couple of targets supporting the feature. I'll create another patch(es) to enable them accordingly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D127041	2022-07-06 10:57:53 -04:00
Paul Robinson	08e4fe6c61	[X86] Add RDPRU instruction Add support for the RDPRU instruction on Zen2 processors. User-facing features: - Clang option -m[no-]rdpru to enable/disable the feature - Support is implicit for znver2/znver3 processors - Preprocessor symbol __RDPRU__ to indicate support - Header rdpruintrin.h to define intrinsics - "rdpru" mnemonic supported for assembler code Internal features: - Clang builtin __builtin_ia32_rdpru - IR intrinsic @llvm.x86.rdpru Differential Revision: https://reviews.llvm.org/D128934	2022-07-06 07:17:47 -07:00
Dmitry Preobrazhensky	2044e4c53e	[AMDGPU][GFX1030][DOC][NFC] Update assembler syntax description Summary of changes: - Update MUBUF lds syntax (see https://reviews.llvm.org/D124485). - Add v_cvt_pkrtz_f16_f32_dpp, v_cvt_pkrtz_f16_f32_sdwa. - Update SMEM syntax (see https://reviews.llvm.org/D127314). - Enable op_sel for v_add_nc_u16, v_sub_nc_u16 (see https://reviews.llvm.org/D123594). - Minor bug fixing and improvements.	2022-07-06 16:54:30 +03:00
Lucas Prates	e0af055741	[Docs] Add release note for ARM's new -mframe-chain option This adds a release note entry for the new -mframe-chain option introduced on D125094. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D129085	2022-07-06 10:07:15 +01:00
Nikita Popov	11950efe06	[ConstExpr] Remove div/rem constant expressions D128820 stopped creating div/rem constant expressions by default; this patch removes support for them entirely. The getUDiv(), getExactUDiv(), getSDiv(), getExactSDiv(), getURem() and getSRem() on ConstantExpr are removed, and ConstantExpr::get() now only accepts binary operators for which ConstantExpr::isSupportedBinOp() returns true. Uses of these methods may be replaced either by corresponding IRBuilder methods, or ConstantFoldBinaryOpOperands (if a constant result is required). On the C API side, LLVMConstUDiv, LLVMConstExactUDiv, LLVMConstSDiv, LLVMConstExactSDiv, LLVMConstURem and LLVMConstSRem are removed and corresponding LLVMBuild methods should be used. Importantly, this also means that constant expressions can no longer trap! This patch still keeps the canTrap() method to minimize diff -- I plan to drop it in a separate NFC patch. Differential Revision: https://reviews.llvm.org/D129148	2022-07-06 10:11:34 +02:00
Alexey Bader	9892706282	Updating office hours	2022-07-05 07:12:11 -04:00
Archibald Elliott	1666f09933	[ARM] Add Support for Cortex-M85 This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI, both of which are enabled by default. Parts have been coauthored by Mark Murray, Alexandros Lamprineas and David Green. Differential Revision: https://reviews.llvm.org/D128415	2022-07-05 10:43:31 +01:00
Dmitry Preobrazhensky	f90f0e8fe7	[AMDGPU][GFX10][DOC][NFC] Update assembler syntax description Summary of changes: - Update MUBUF lds syntax (see https://reviews.llvm.org/D124485). - Add v_cvt_pkrtz_f16_f32_dpp, v_cvt_pkrtz_f16_f32_sdwa. - Update SMEM syntax (see https://reviews.llvm.org/D127314). - Enable op_sel for v_add_nc_u16, v_sub_nc_u16 (see https://reviews.llvm.org/D123594). - Minor bug fixing and improvements.	2022-07-04 13:30:56 +03:00
Edd Barrett	04f6bf482b	Revise outdated parts of the developer policy. Specifically: - Diffs are not passed around on mailing lists any more. - Diffs should be `-U999999`. - Clarify part about automated emails. Differential review: https://reviews.llvm.org/D128645	2022-07-04 07:05:29 +01:00
Nikita Popov	7283f48a05	[IR] Remove support for insertvalue constant expression This removes the insertvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. This is very similar to the extractvalue removal from D125795. insertvalue is also not supported in bitcode, so no auto-upgrade is necessary. ConstantExpr::getInsertValue() can be replaced with IRBuilder::CreateInsertValue() or ConstantFoldInsertValueInstruction(), depending on whether a constant result is required (with the latter being fallible). The ConstantExpr::hasIndices() and ConstantExpr::getIndices() methods also go away here, because there are no longer any constant expressions with indices. Differential Revision: https://reviews.llvm.org/D128719	2022-07-04 09:27:22 +02:00
Chen Zheng	2c3784cff8	[SCEV] recognize llvm.annotation intrinsic Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D127835	2022-07-03 21:02:50 -04:00
Dmitry Preobrazhensky	3a4d9b6a68	[AMDGPU][GFX908][DOC][NFC] Update assembler syntax description Summary of changes: - Remove dst for global_atomic_add_f32, global_atomic_pk_add_f16. - Make vdata input-only for buffer_atomic_add_f32, buffer_atomic_pk_add_f16. - Other minor improvements.	2022-07-01 12:46:45 +03:00
Dmitry Preobrazhensky	36c9e9968a	[AMDGPU][GFX940][DOC][NFC] Update assembler syntax description Summary of changes: - Update SMEM syntax (see https://reviews.llvm.org/D127314). - Minor improvements.	2022-07-01 12:22:57 +03:00
Marc Auberer	972fe43133	[Kaleidoscope] Remove unused function argument Removes an unused function argument from a code listing in the Kaleidoscope tutorial in step 9. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D128628	2022-06-30 20:47:01 +00:00
Kostya Serebryany	92fb310151	[libFuzzer] Extend the fuzz target interface to allow -1 return value. With this change, fuzz targets may choose to return -1 to indicate that the input should not be added to the corpus regardless of the coverage it generated. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D128749	2022-06-30 13:21:27 -07:00
Fangrui Song	2601b90d83	[llvm-objdump] Default to --mcpu=future for PPC64 GNU objdump disassembles all unknown instructions by default. Match this user friendly behavior with the cpu value `future`. Differential Revision: https://reviews.llvm.org/D127824	2022-06-30 11:30:35 -07:00
Fangrui Song	275862c75d	[llvm-objdump] Default to --mattr=+all for AArch64 GNU objdump disassembles all unknown instructions by default. Match this user friendly behavior with the target feature "all" (D128029) designed for disassemblers. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D128030	2022-06-30 11:17:56 -07:00
Daniel Thornburgh	05a4b64035	[llvm-dwarfdump] --show-sources option to show all sources This option allows printing all sources used by an object file. Reviewed By: dblaikie, jhenderson Differential Revision: https://reviews.llvm.org/D87656	2022-06-30 09:53:08 -07:00
Fangrui Song	45ae553109	[llvm-objcopy] Remove support for legacy .zdebug sections clang 14 removed -gz=zlib-gnu support and ld.lld removed linker input support for zlib-gnu in D126793. Now let's remove zlib-gnu from llvm-objcopy. * .zdebug* sections are no longer recognized as debug sections. --strip* don't remove them. They are copied like other opaque sections * --decompress-debug-sections does not uncompress .zdebug* sections * --compress-debug-sections=zlib-gnu is not supported It is very rare, but in case a user has object files using .zdebug, they can use llvm-objcopy<15 or GNU objcopy for uncompression. --compress-debug-sections=zlib-gnu is unlikely ever used by anyone, so I do not add a custom diagnostic. Differential Revision: https://reviews.llvm.org/D128688	2022-06-29 10:42:55 -07:00
Fangrui Song	bf223e43fe	[llvm-ar] Add --output to specify output directory From binutils 2.34 onwards, ar supports --output to specify a directory where archive members should be extracted to. Port this feature. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D128626	2022-06-29 10:00:43 -07:00
Dmitry Preobrazhensky	1774f2e326	[AMDGPU][GFX90a][DOC][NFC] Update assembler syntax description Summary of changes: - Update MUBUF lds syntax (see https://reviews.llvm.org/D124485). - Update SMEM syntax (see https://reviews.llvm.org/D127314). - Enable src0=literal for v_madak, v_madmk (see https://reviews.llvm.org/D111067). - Correct src0 operands of v_accvgpr_write_b32. - Correct description of s_getreg/s_setreg (add TBA/TMA). - Remove SYSMSG_OP_HOST_TRAP_ACK message. - Minor bug fixing and improvements.	2022-06-29 13:31:09 +03:00
Rahman Lavaee	0aa6df6575	[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks. This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. A use-case example for the feature field is encoding multi-section functions more concisely using a different format. Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end labels. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address. This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary). The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D121346	2022-06-28 07:42:54 -07:00
Nikita Popov	5548e807b5	[IR] Remove support for extractvalue constant expression This removes the extractvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. extractvalue is already not supported in bitcode, so we do not need to worry about bitcode auto-upgrade. Uses of ConstantExpr::getExtractValue() should be replaced with IRBuilder::CreateExtractValue() (if the fact that the result is constant is not important) or ConstantFoldExtractValueInstruction() (if it is). Though for this particular case, it is also possible and usually preferable to use getAggregateElement() instead. The C API function LLVMConstExtractValue() is removed, as the underlying constant expression no longer exists. Instead, LLVMBuildExtractValue() should be used (which will constant fold or create an instruction). Depending on the use-case, LLVMGetAggregateElement() may also be used instead. Differential Revision: https://reviews.llvm.org/D125795	2022-06-28 10:40:17 +02:00
Yuanfang Chen	6e2b3cc6ca	Fix sphinx docs build Fix "Title underline too short."	2022-06-27 12:22:04 -07:00
Yuanfang Chen	6678f8e505	[ubsan] Using metadata instead of prologue data for function sanitizer Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated, the self (global value) references inside `Prologue Data` still point to the original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`. This patch detaches the information from function `Prologue Data` and attaches it to a function metadata node. This and D116130 fix https://github.com/llvm/llvm-project/issues/49689. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D115844	2022-06-27 12:09:13 -07:00
Daniel Thornburgh	eb5af0acf0	[Symbolize] Add log markup --filter to llvm-symbolizer. This adds a --filter option to llvm-symbolizer. This takes log-bearing symbolizer markup from stdin and writes a human-readable version to stdout. For now, this only implements the "symbol" markup tag; all others are passed through unaltered. This is a proof-of-concept bit of functionality; implementing the various tags is more-or-less just a matter of hooking up various parts of the Symbolize library to the architecture established here. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D126980	2022-06-27 10:44:15 -07:00
Chris Bieneman	ee0dd2ec11	[Docs] Update clang & llvm release notes for HLSL Adding release note entries for LLVM & Clang to introduce the HLSL & DirectX support that is being added. Reviewed By: aaron.ballman, MaskRay Differential Revision: https://reviews.llvm.org/D127890	2022-06-27 12:41:14 -05:00
Dmitry Preobrazhensky	480f3e0228	[AMDGPU][GFX9][DOC][NFC] Update assembler syntax description Summary of changes: - Updated MUBUF lds syntax (see https://reviews.llvm.org/D124485). - Updated SMEM syntax (see https://reviews.llvm.org/D127314). - Enabled src0=literal for v_madak, v_madmk (see https://reviews.llvm.org/D111067). - Removed SYSMSG_OP_HOST_TRAP_ACK message. - Minor bug fixing and improvements.	2022-06-27 14:03:58 +03:00
Edd Barrett	94fbb147c8	[STACKMAPS] Document+test UINT64_MAX stack size. When a function does a dynamic stack allocation, the function's stack size (in the stack map) is reported as UINT64_MAX. This change tests and documents this property. Differential Revision: https://reviews.llvm.org/D128525	2022-06-27 11:57:07 +01:00
Bradley Smith	a83aa33d1b	[IR] Move vector.insert/vector.extract out of experimental namespace These intrinsics are now fundamental for SVE code generation and have been present for a year and a half, hence move them out of the experimental namespace. Differential Revision: https://reviews.llvm.org/D127976	2022-06-27 10:48:45 +00:00
Venkata Ramanaiah Nalamothu	5d2cc4d838	[AMDGPU][NFC] Correct typo in DWARF Extensions For Heterogeneous Debugging The `DW_AT_LLVM_address_space` attribute is mentioned as `DW_AT_address_space` in one of the notes. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D128210	2022-06-24 11:05:08 +05:30
Arthur Eubanks	865812c3af	[docs][NewPM] Add more info on why accessing mutable outer analyses is disallowed Reviewed By: asbirlea, rnk Differential Revision: https://reviews.llvm.org/D128374	2022-06-23 10:05:37 -07:00
Nikita Popov	da34966a5a	[llvm-c] Add LLVMGetAggregateElement() function This adds LLVMGetAggregateElement() as a wrapper for Constant::getAggregateElement(), which allows fetching a struct/array/vector element without handling different possible underlying representations. As the changed echo test shows, previously you for example had to treat ConstantArray (use LLVMGetOperand) and ConstantDataArray (use LLVMGetElementAsConstant) separately, not to mention all the other possible representations (like PoisonValue). I've deprecated LLVMGetElementAsConstant() in favor of the new function, which is strictly more powerful (but I could be convinced to drop the deprecation). This is partly motivated by https://reviews.llvm.org/D125795, which drops LLVMConstExtractValue() because the underlying constant expression no longer exists. This function could previously be used as a poor man's getAggregateElement(). Differential Revision: https://reviews.llvm.org/D128417	2022-06-23 14:50:54 +02:00
Kristof Beyls	91139cee15	[docs] Document and publish LLVM community calendar Let's introduce and publish an LLVM community calendar. The idea is that organizers of events such as online sync-ups or office hours invite calendar@llvm.org to the event they're creating. That way, the calendar publicly visible at https://calendar.google.com/calendar/u/0/embed?src=calendar@llvm.org will show the event. The hope is that having a single calendar showing all LLVM events makes it easier for both newcomers and experienced people to discover events they're interested in. This patch partially implements https://github.com/llvm/llvm-project/issues/55426 We could also give pointers to the calendar in a few other places, e.g. from the main LLVM page, but let's introduce them incrementally. Differential Revision: https://reviews.llvm.org/D127852	2022-06-23 13:29:41 +02:00
wangpc	634484885c	[TableGen] Add new operator !exists We can cast a string to a record via !cast, but we have no mechanism to check whether the cast is valid, and TableGen will raise an error if the cast fails. Besides, we have no semantic null in TableGen (we have `?` but different backends handle uninitialized values differently), so an operator like `dyn_cast<>` is hard to implement. In this patch, we add a new operator `!exists<T>(s)` to check whether a record with type `T` and name `s` exists. Self-references are allowed just like `!cast`. With this, we can write code like: ``` class dyn_cast_to_record<string name> { R value = !if(!exists<R>(name), !cast<R>(name), default_value); } defvar v = dyn_cast_to_record<"R0">.value; // R0 or default_value. ``` Reviewed By: tra, nhaehnle Differential Revision: https://reviews.llvm.org/D127948	2022-06-23 11:11:47 +08:00
Tom Stellard	7dbb366129	HowToReleaseLLVM: Add description of the bug triage process Reviewed By: andreil99 Differential Revision: https://reviews.llvm.org/D126985	2022-06-21 22:18:35 -07:00
Kristof Beyls	6cb076783e	[docs] More clearly document that the CoC applies to online sync-ups and office hours. * Also removes the code of conduct document listed as a "proposal". Fixes #55430 Differential Revision: https://reviews.llvm.org/D126954	2022-06-20 13:47:53 +02:00
Chris Bieneman	d5745d0015	[docs] Adding table of object file formats The added section and table here list the object file formats LLVM MC supports and which targets support each format. Differential Revision: https://reviews.llvm.org/D127645	2022-06-17 13:37:52 -05:00
Chris Bieneman	5b77a45c7f	[docs] Adding DirectX target usage doc This document is a work in progress to begin fleshing out documentation for the DirectX backend and related changes in the LLVM project. This is not intended to be exhaustive or complete; it is intended as a starting point so that future changes have a place for documentation to land. Differential Revision: https://reviews.llvm.org/D127640	2022-06-17 13:34:25 -05:00
Arthur Eubanks	05704e785a	[docs] Fix typo	2022-06-17 11:29:07 -07:00
Phoebe Wang	655ba9c8a1	Reland "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""" This resolves problems reported in commit `1a20252978`. 1. Promote to float lowering for nodes XINT_TO_FP 2. Bail out f16 from shuffle combine due to vector type is not legal in the version	2022-06-17 21:34:05 +08:00
Benjamin Kramer	1a20252978	Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""" This reverts commit `04a3d5f3a1`. I see two more issues: - uitofp/sitofp from i32/i64 to half now generates __floatsihf/__floatdihf, which exists in neither compiler-rt nor libgcc - This crashes when legalizing the bitcast: ``` ; RUN: llc < %s -mcpu=skx define void @main.45(ptr nocapture readnone %retval, ptr noalias nocapture readnone %run_options, ptr noalias nocapture readnone %params, ptr noalias nocapture readonly %buffer_table, ptr noalias nocapture readnone %status, ptr noalias nocapture readnone %prof_counters) local_unnamed_addr { entry: %fusion = load ptr, ptr %buffer_table, align 8 %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1 %Arg_1.2 = load ptr, ptr %0, align 8 %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 2 %Arg_0.1 = load ptr, ptr %1, align 8 %2 = load half, ptr %Arg_0.1, align 8 %3 = bitcast half %2 to i16 %4 = and i16 %3, 32767 %5 = icmp eq i16 %4, 0 %6 = and i16 %3, -32768 %broadcast.splatinsert = insertelement <4 x half> poison, half %2, i64 0 %broadcast.splat = shufflevector <4 x half> %broadcast.splatinsert, <4 x half> poison, <4 x i32> zeroinitializer %broadcast.splatinsert9 = insertelement <4 x i16> poison, i16 %4, i64 0 %broadcast.splat10 = shufflevector <4 x i16> %broadcast.splatinsert9, <4 x i16> poison, <4 x i32> zeroinitializer %broadcast.splatinsert11 = insertelement <4 x i16> poison, i16 %6, i64 0 %broadcast.splat12 = shufflevector <4 x i16> %broadcast.splatinsert11, <4 x i16> poison, <4 x i32> zeroinitializer %broadcast.splatinsert13 = insertelement <4 x i16> poison, i16 %3, i64 0 %broadcast.splat14 = shufflevector <4 x i16> %broadcast.splatinsert13, <4 x i16> poison, <4 x i32> zeroinitializer %wide.load = load <4 x half>, ptr %Arg_1.2, align 8 %7 = fcmp uno <4 x half> %broadcast.splat, %wide.load %8 = fcmp oeq <4 x half> %broadcast.splat, %wide.load %9 = bitcast <4 x half> 
%wide.load to <4 x i16> %10 = and <4 x i16> %9, <i16 32767, i16 32767, i16 32767, i16 32767> %11 = icmp eq <4 x i16> %10, zeroinitializer %12 = and <4 x i16> %9, <i16 -32768, i16 -32768, i16 -32768, i16 -32768> %13 = or <4 x i16> %12, <i16 1, i16 1, i16 1, i16 1> %14 = select <4 x i1> %11, <4 x i16> %9, <4 x i16> %13 %15 = icmp ugt <4 x i16> %broadcast.splat10, %10 %16 = icmp ne <4 x i16> %broadcast.splat12, %12 %17 = or <4 x i1> %15, %16 %18 = select <4 x i1> %17, <4 x i16> <i16 -1, i16 -1, i16 -1, i16 -1>, <4 x i16> <i16 1, i16 1, i16 1, i16 1> %19 = add <4 x i16> %18, %broadcast.splat14 %20 = select i1 %5, <4 x i16> %14, <4 x i16> %19 %21 = select <4 x i1> %8, <4 x i16> %9, <4 x i16> %20 %22 = bitcast <4 x i16> %21 to <4 x half> %23 = select <4 x i1> %7, <4 x half> <half 0xH7E00, half 0xH7E00, half 0xH7E00, half 0xH7E00>, <4 x half> %22 store <4 x half> %23, ptr %fusion, align 16 ret void } ``` llc: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:977: void (anonymous namespace)::SelectionDAGLegalize::LegalizeOp(llvm::SDNode ): Assertion `(TLI.getTypeAction(DAG.getContext(), Op.getValueType()) == TargetLowering::TypeLegal || Op.getOpcode() == ISD::TargetConstant || Op.getOpcode() == ISD::Register) && "Unexpected illegal type!"' failed.	2022-06-17 09:43:07 +02:00
Phoebe Wang	04a3d5f3a1	Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""" Fix the crash on lowering X86ISD::FCMP.	2022-06-17 12:12:17 +08:00
Arthur Eubanks	47bfc365fc	[docs][OpaquePtr] Add detail to motivations behind opaque pointers Reviewed By: #opaque-pointers, rnk, nikic Differential Revision: https://reviews.llvm.org/D126309	2022-06-16 10:17:09 -07:00
Diana Picus	24b98520e2	Update FileCheck docs after D95849. NFCI The default has been false for quite a while now. Differential Revision: https://reviews.llvm.org/D127846	2022-06-16 08:18:12 +00:00
Frederik Gossen	3cd5696a33	Revert "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""" This reverts commit `e1c5afa47d`. This introduces crashes in the JAX backend on CPU. A reproducer in LLVM is below. Let me know if you have trouble reproducing this. ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [4 x i8] c"\00\00\00?" @1 = private unnamed_addr constant [4 x i8] c"\1C}\908" @2 = private unnamed_addr constant [4 x i8] c"?\00\\4" @3 = private unnamed_addr constant [4 x i8] c"%ci1" @4 = private unnamed_addr constant [4 x i8] zeroinitializer @5 = private unnamed_addr constant [4 x i8] c"\00\00\00\C0" @6 = private unnamed_addr constant [4 x i8] c"\00\00\00B" @7 = private unnamed_addr constant [4 x i8] c"\94\B4\C22" @8 = private unnamed_addr constant [4 x i8] c"^\09B6" @9 = private unnamed_addr constant [4 x i8] c"\15\F3M?" 
@10 = private unnamed_addr constant [4 x i8] c"e\CC\\;" @11 = private unnamed_addr constant [4 x i8] c"d\BD/>" @12 = private unnamed_addr constant [4 x i8] c"V\F4I=" @13 = private unnamed_addr constant [4 x i8] c"\10\CB,<" @14 = private unnamed_addr constant [4 x i8] c"\AC\E3\D6:" @15 = private unnamed_addr constant [4 x i8] c"\DC\A8E9" @16 = private unnamed_addr constant [4 x i8] c"\C6\FA\897" @17 = private unnamed_addr constant [4 x i8] c"%\F9\955" @18 = private unnamed_addr constant [4 x i8] c"\B5\DB\813" @19 = private unnamed_addr constant [4 x i8] c"\B4W_\B2" @20 = private unnamed_addr constant [4 x i8] c"\1Cc\8F\B4" @21 = private unnamed_addr constant [4 x i8] c"~3\94\B6" @22 = private unnamed_addr constant [4 x i8] c"3Yq\B8" @23 = private unnamed_addr constant [4 x i8] c"\E9\17\17\BA" @24 = private unnamed_addr constant [4 x i8] c"\F1\B2\8D\BB" @25 = private unnamed_addr constant [4 x i8] c"\F8t\C2\BC" @26 = private unnamed_addr constant [4 x i8] c"\82[\C2\BD" @27 = private unnamed_addr constant [4 x i8] c"uB-?" 
@28 = private unnamed_addr constant [4 x i8] c"^\FF\9B\BE" @29 = private unnamed_addr constant [4 x i8] c"\00\00\00A" ; Function Attrs: uwtable define void @main.158(ptr %retval, ptr noalias %run_options, ptr noalias %params, ptr noalias %buffer_table, ptr noalias %status, ptr noalias %prof_counters) #0 { entry: %fusion.invar_address.dim.1 = alloca i64, align 8 %fusion.invar_address.dim.0 = alloca i64, align 8 %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1 %Arg_0.1 = load ptr, ptr %0, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 0 %fusion = load ptr, ptr %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2 store i64 0, ptr %fusion.invar_address.dim.0, align 8 br label %fusion.loop_header.dim.0 return: ; preds = %fusion.loop_exit.dim.0 ret void fusion.loop_header.dim.0: ; preds = %fusion.loop_exit.dim.1, %entry %fusion.indvar.dim.0 = load i64, ptr %fusion.invar_address.dim.0, align 8 %2 = icmp uge i64 %fusion.indvar.dim.0, 3 br i1 %2, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0 fusion.loop_body.dim.0: ; preds = %fusion.loop_header.dim.0 store i64 0, ptr %fusion.invar_address.dim.1, align 8 br label %fusion.loop_header.dim.1 fusion.loop_header.dim.1: ; preds = %fusion.loop_body.dim.1, %fusion.loop_body.dim.0 %fusion.indvar.dim.1 = load i64, ptr %fusion.invar_address.dim.1, align 8 %3 = icmp uge i64 %fusion.indvar.dim.1, 1 br i1 %3, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1 fusion.loop_body.dim.1: ; preds = %fusion.loop_header.dim.1 %4 = getelementptr inbounds [3 x [1 x half]], ptr %Arg_0.1, i64 0, i64 %fusion.indvar.dim.0, i64 0 %5 = load half, ptr %4, align 2, !invariant.load !0, !noalias !3 %6 = fpext half %5 to float %7 = call float @llvm.fabs.f32(float %6) %constant.121 = load float, ptr @29, align 4 %compare.2 = fcmp ole float %7, %constant.121 %8 = zext i1 %compare.2 to i8 %constant.120 = load float, ptr @0, align 4 %multiply.95 = 
fmul float %7, %constant.120 %constant.119 = load float, ptr @5, align 4 %add.82 = fadd float %multiply.95, %constant.119 %constant.118 = load float, ptr @4, align 4 %multiply.94 = fmul float %add.82, %constant.118 %constant.117 = load float, ptr @19, align 4 %add.81 = fadd float %multiply.94, %constant.117 %multiply.92 = fmul float %add.82, %add.81 %constant.116 = load float, ptr @18, align 4 %add.79 = fadd float %multiply.92, %constant.116 %multiply.91 = fmul float %add.82, %add.79 %subtract.87 = fsub float %multiply.91, %add.81 %constant.115 = load float, ptr @20, align 4 %add.78 = fadd float %subtract.87, %constant.115 %multiply.89 = fmul float %add.82, %add.78 %subtract.86 = fsub float %multiply.89, %add.79 %constant.114 = load float, ptr @17, align 4 %add.76 = fadd float %subtract.86, %constant.114 %multiply.88 = fmul float %add.82, %add.76 %subtract.84 = fsub float %multiply.88, %add.78 %constant.113 = load float, ptr @21, align 4 %add.75 = fadd float %subtract.84, %constant.113 %multiply.86 = fmul float %add.82, %add.75 %subtract.83 = fsub float %multiply.86, %add.76 %constant.112 = load float, ptr @16, align 4 %add.73 = fadd float %subtract.83, %constant.112 %multiply.85 = fmul float %add.82, %add.73 %subtract.81 = fsub float %multiply.85, %add.75 %constant.111 = load float, ptr @22, align 4 %add.72 = fadd float %subtract.81, %constant.111 %multiply.83 = fmul float %add.82, %add.72 %subtract.80 = fsub float %multiply.83, %add.73 %constant.110 = load float, ptr @15, align 4 %add.70 = fadd float %subtract.80, %constant.110 %multiply.82 = fmul float %add.82, %add.70 %subtract.78 = fsub float %multiply.82, %add.72 %constant.109 = load float, ptr @23, align 4 %add.69 = fadd float %subtract.78, %constant.109 %multiply.80 = fmul float %add.82, %add.69 %subtract.77 = fsub float %multiply.80, %add.70 %constant.108 = load float, ptr @14, align 4 %add.68 = fadd float %subtract.77, %constant.108 %multiply.79 = fmul float %add.82, %add.68 %subtract.75 = fsub float 
%multiply.79, %add.69 %constant.107 = load float, ptr @24, align 4 %add.67 = fadd float %subtract.75, %constant.107 %multiply.77 = fmul float %add.82, %add.67 %subtract.74 = fsub float %multiply.77, %add.68 %constant.106 = load float, ptr @13, align 4 %add.66 = fadd float %subtract.74, %constant.106 %multiply.76 = fmul float %add.82, %add.66 %subtract.72 = fsub float %multiply.76, %add.67 %constant.105 = load float, ptr @25, align 4 %add.65 = fadd float %subtract.72, %constant.105 %multiply.74 = fmul float %add.82, %add.65 %subtract.71 = fsub float %multiply.74, %add.66 %constant.104 = load float, ptr @12, align 4 %add.64 = fadd float %subtract.71, %constant.104 %multiply.73 = fmul float %add.82, %add.64 %subtract.69 = fsub float %multiply.73, %add.65 %constant.103 = load float, ptr @26, align 4 %add.63 = fadd float %subtract.69, %constant.103 %multiply.71 = fmul float %add.82, %add.63 %subtract.67 = fsub float %multiply.71, %add.64 %constant.102 = load float, ptr @11, align 4 %add.62 = fadd float %subtract.67, %constant.102 %multiply.70 = fmul float %add.82, %add.62 %subtract.66 = fsub float %multiply.70, %add.63 %constant.101 = load float, ptr @28, align 4 %add.61 = fadd float %subtract.66, %constant.101 %multiply.68 = fmul float %add.82, %add.61 %subtract.65 = fsub float %multiply.68, %add.62 %constant.100 = load float, ptr @27, align 4 %add.60 = fadd float %subtract.65, %constant.100 %subtract.64 = fsub float %add.60, %add.62 %multiply.66 = fmul float %subtract.64, %constant.120 %constant.99 = load float, ptr @6, align 4 %divide.4 = fdiv float %constant.99, %7 %add.59 = fadd float %divide.4, %constant.119 %multiply.65 = fmul float %add.59, %constant.118 %constant.98 = load float, ptr @3, align 4 %add.58 = fadd float %multiply.65, %constant.98 %multiply.64 = fmul float %add.59, %add.58 %constant.97 = load float, ptr @7, align 4 %add.57 = fadd float %multiply.64, %constant.97 %multiply.63 = fmul float %add.59, %add.57 %subtract.63 = fsub float %multiply.63, 
%add.58 %constant.96 = load float, ptr @2, align 4 %add.56 = fadd float %subtract.63, %constant.96 %multiply.62 = fmul float %add.59, %add.56 %subtract.62 = fsub float %multiply.62, %add.57 %constant.95 = load float, ptr @8, align 4 %add.55 = fadd float %subtract.62, %constant.95 %multiply.61 = fmul float %add.59, %add.55 %subtract.61 = fsub float %multiply.61, %add.56 %constant.94 = load float, ptr @1, align 4 %add.54 = fadd float %subtract.61, %constant.94 %multiply.60 = fmul float %add.59, %add.54 %subtract.60 = fsub float %multiply.60, %add.55 %constant.93 = load float, ptr @10, align 4 %add.53 = fadd float %subtract.60, %constant.93 %multiply.59 = fmul float %add.59, %add.53 %subtract.59 = fsub float %multiply.59, %add.54 %constant.92 = load float, ptr @9, align 4 %add.52 = fadd float %subtract.59, %constant.92 %subtract.58 = fsub float %add.52, %add.54 %multiply.58 = fmul float %subtract.58, %constant.120 %9 = call float @llvm.sqrt.f32(float %7) %10 = fdiv float 1.000000e+00, %9 %multiply.57 = fmul float %multiply.58, %10 %11 = trunc i8 %8 to i1 %12 = select i1 %11, float %multiply.66, float %multiply.57 %13 = fptrunc float %12 to half %14 = getelementptr inbounds [3 x [1 x half]], ptr %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 0 store half %13, ptr %14, align 2, !alias.scope !3 %invar.inc1 = add nuw nsw i64 %fusion.indvar.dim.1, 1 store i64 %invar.inc1, ptr %fusion.invar_address.dim.1, align 8 br label %fusion.loop_header.dim.1 fusion.loop_exit.dim.1: ; preds = %fusion.loop_header.dim.1 %invar.inc = add nuw nsw i64 %fusion.indvar.dim.0, 1 store i64 %invar.inc, ptr %fusion.invar_address.dim.0, align 8 br label %fusion.loop_header.dim.0 fusion.loop_exit.dim.0: ; preds = %fusion.loop_header.dim.0 br label %return } ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn declare float @llvm.fabs.f32(float %0) #1 ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn declare float @llvm.sqrt.f32(float 
%0) #1 attributes #0 = { uwtable "denormal-fp-math"="preserve-sign" "no-frame-pointer-elim"="false" } attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 6} !2 = !{i64 8} !3 = !{!4} !4 = !{!"buffer: {index:0, offset:0, size:6}", !5} !5 = !{!"XLA global AA domain"}	2022-06-15 18:04:42 -04:00
Phoebe Wang	e1c5afa47d	Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"" Fixed the missing SQRT promotion. Adding several missing operations too.	2022-06-15 23:00:18 +08:00
Thomas Joerg	37455b1f71	Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"" This reverts commit `6e02e27536`. This introduces a crash in the backend. Reproducer in MLIR's LLVM dialect follows. Let me know if you have trouble reproducing this. module { llvm.func @malloc(i64) -> !llvm.ptr<i8> llvm.func @_mlir_ciface_tf_report_error(!llvm.ptr<i8>, i32, !llvm.ptr<i8>) llvm.mlir.global internal constant @error_message_2208944672953921889("failed to allocate memory at loc(\22-\22:3:8)\00") llvm.func @_mlir_ciface_tf_alloc(!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8> llvm.func @Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<i8>, %arg1: i64, %arg2: !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> attributes {llvm.emit_c_interface, tf_entry} { %0 = llvm.mlir.constant(8 : i32) : i32 %1 = llvm.mlir.constant(8 : index) : i64 %2 = llvm.mlir.constant(2 : index) : i64 %3 = llvm.mlir.constant(dense<0.000000e+00> : vector<4xf16>) : vector<4xf16> %4 = llvm.mlir.constant(dense<[0, 1, 2, 3]> : vector<4xi32>) : vector<4xi32> %5 = llvm.mlir.constant(dense<1.000000e+00> : vector<4xf16>) : vector<4xf16> %6 = llvm.mlir.constant(false) : i1 %7 = llvm.mlir.constant(1 : i32) : i32 %8 = llvm.mlir.constant(0 : i32) : i32 %9 = llvm.mlir.constant(4 : index) : i64 %10 = llvm.mlir.constant(0 : index) : i64 %11 = llvm.mlir.constant(1 : index) : i64 %12 = llvm.mlir.constant(-1 : index) : i64 %13 = llvm.mlir.null : !llvm.ptr<f16> %14 = llvm.getelementptr %13[%9] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16> %15 = llvm.ptrtoint %14 : !llvm.ptr<f16> to i64 %16 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16> %17 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16> %18 = llvm.mlir.null : !llvm.ptr<i64> %19 = llvm.getelementptr %18[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> %20 = llvm.ptrtoint %19 : !llvm.ptr<i64> to i64 %21 = llvm.alloca %20 x i64 : (i64) -> !llvm.ptr<i64> llvm.br ^bb1(%10 : i64) ^bb1(%22: 
i64): // 2 preds: ^bb0, ^bb2 %23 = llvm.icmp "slt" %22, %arg1 : i64 llvm.cond_br %23, ^bb2, ^bb3 ^bb2: // pred: ^bb1 %24 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>> %25 = llvm.getelementptr %24[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64> %26 = llvm.add %22, %11 : i64 %27 = llvm.getelementptr %25[%26] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> %28 = llvm.load %27 : !llvm.ptr<i64> %29 = llvm.getelementptr %21[%22] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> llvm.store %28, %29 : !llvm.ptr<i64> llvm.br ^bb1(%26 : i64) ^bb3: // pred: ^bb1 llvm.br ^bb4(%10, %11 : i64, i64) ^bb4(%30: i64, %31: i64): // 2 preds: ^bb3, ^bb5 %32 = llvm.icmp "slt" %30, %arg1 : i64 llvm.cond_br %32, ^bb5, ^bb6 ^bb5: // pred: ^bb4 %33 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>> %34 = llvm.getelementptr %33[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64> %35 = llvm.add %30, %11 : i64 %36 = llvm.getelementptr %34[%35] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> %37 = llvm.load %36 : !llvm.ptr<i64> %38 = llvm.mul %37, %31 : i64 llvm.br ^bb4(%35, %38 : i64, i64) ^bb6: // pred: ^bb4 %39 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>> %40 = llvm.getelementptr %39[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>> %41 = llvm.load %40 : !llvm.ptr<ptr<f16>> %42 = llvm.getelementptr %13[%11] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16> %43 = llvm.ptrtoint %42 : !llvm.ptr<f16> to i64 %44 = llvm.alloca %7 x i32 : (i32) -> !llvm.ptr<i32> llvm.store %8, %44 : !llvm.ptr<i32> %45 = llvm.call @_mlir_ciface_tf_alloc(%arg0, %31, %43, %8, %7, %44) : (!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8> %46 = llvm.bitcast %45 : !llvm.ptr<i8> to !llvm.ptr<f16> %47 = llvm.icmp "eq" %31, %10 : i64 %48 = llvm.or %6, %47 : i1 %49 = llvm.mlir.null : !llvm.ptr<i8> %50 = llvm.icmp "ne" %45, %49 : !llvm.ptr<i8> %51 = llvm.or %50, %48 : i1 llvm.cond_br 
%51, ^bb7, ^bb13 ^bb7: // pred: ^bb6 %52 = llvm.urem %31, %9 : i64 %53 = llvm.sub %31, %52 : i64 llvm.br ^bb8(%10 : i64) ^bb8(%54: i64): // 2 preds: ^bb7, ^bb9 %55 = llvm.icmp "slt" %54, %53 : i64 llvm.cond_br %55, ^bb9, ^bb10 ^bb9: // pred: ^bb8 %56 = llvm.mul %54, %11 : i64 %57 = llvm.add %56, %10 : i64 %58 = llvm.add %57, %10 : i64 %59 = llvm.getelementptr %41[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16> %60 = llvm.bitcast %59 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>> %61 = llvm.load %60 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>> %62 = "llvm.intr.sqrt"(%61) : (vector<4xf16>) -> vector<4xf16> %63 = llvm.fdiv %5, %62 : vector<4xf16> %64 = llvm.getelementptr %46[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16> %65 = llvm.bitcast %64 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>> llvm.store %63, %65 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>> %66 = llvm.add %54, %9 : i64 llvm.br ^bb8(%66 : i64) ^bb10: // pred: ^bb8 %67 = llvm.icmp "ult" %53, %31 : i64 llvm.cond_br %67, ^bb11, ^bb12 ^bb11: // pred: ^bb10 %68 = llvm.mul %53, %12 : i64 %69 = llvm.add %31, %68 : i64 %70 = llvm.mul %53, %11 : i64 %71 = llvm.add %70, %10 : i64 %72 = llvm.trunc %69 : i64 to i32 %73 = llvm.mlir.undef : vector<4xi32> %74 = llvm.insertelement %72, %73[%8 : i32] : vector<4xi32> %75 = llvm.shufflevector %74, %73 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<4xi32>, vector<4xi32> %76 = llvm.icmp "slt" %4, %75 : vector<4xi32> %77 = llvm.add %71, %10 : i64 %78 = llvm.getelementptr %41[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16> %79 = llvm.bitcast %78 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>> %80 = llvm.intr.masked.load %79, %76, %3 {alignment = 2 : i32} : (!llvm.ptr<vector<4xf16>>, vector<4xi1>, vector<4xf16>) -> vector<4xf16> %81 = llvm.bitcast %16 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>> llvm.store %80, %81 : !llvm.ptr<vector<4xf16>> %82 = llvm.load %81 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>> %83 = "llvm.intr.sqrt"(%82) : (vector<4xf16>) -> vector<4xf16> %84 = 
llvm.fdiv %5, %83 : vector<4xf16> %85 = llvm.bitcast %17 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>> llvm.store %84, %85 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>> %86 = llvm.load %85 : !llvm.ptr<vector<4xf16>> %87 = llvm.getelementptr %46[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16> %88 = llvm.bitcast %87 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>> llvm.intr.masked.store %86, %88, %76 {alignment = 2 : i32} : vector<4xf16>, vector<4xi1> into !llvm.ptr<vector<4xf16>> llvm.br ^bb12 ^bb12: // 2 preds: ^bb10, ^bb11 %89 = llvm.mul %2, %1 : i64 %90 = llvm.mul %arg1, %2 : i64 %91 = llvm.add %90, %11 : i64 %92 = llvm.mul %91, %1 : i64 %93 = llvm.add %89, %92 : i64 %94 = llvm.alloca %93 x i8 : (i64) -> !llvm.ptr<i8> %95 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>> llvm.store %46, %95 : !llvm.ptr<ptr<f16>> %96 = llvm.getelementptr %95[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>> llvm.store %46, %96 : !llvm.ptr<ptr<f16>> %97 = llvm.getelementptr %95[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>> %98 = llvm.bitcast %97 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64> llvm.store %10, %98 : !llvm.ptr<i64> %99 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>> %100 = llvm.getelementptr %99[%10, 3] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>, i64) -> !llvm.ptr<i64> %101 = llvm.getelementptr %100[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> %102 = llvm.sub %arg1, %11 : i64 llvm.br ^bb14(%102, %11 : i64, i64) ^bb13: // pred: ^bb6 %103 = llvm.mlir.addressof @error_message_2208944672953921889 : !llvm.ptr<array<42 x i8>> %104 = llvm.getelementptr %103[%10, %10] : (!llvm.ptr<array<42 x i8>>, i64, i64) -> !llvm.ptr<i8> llvm.call @_mlir_ciface_tf_report_error(%arg0, %0, %104) : (!llvm.ptr<i8>, i32, !llvm.ptr<i8>) -> () %105 = llvm.mul %2, %1 : i64 %106 = llvm.mul %2, %10 : i64 %107 = llvm.add %106, %11 : i64 %108 = llvm.mul %107, %1 : i64 %109 = llvm.add %105, %108 : i64 %110 = llvm.alloca %109 x i8 : 
(i64) -> !llvm.ptr<i8> %111 = llvm.bitcast %110 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>> llvm.store %13, %111 : !llvm.ptr<ptr<f16>> %112 = llvm.getelementptr %111[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>> llvm.store %13, %112 : !llvm.ptr<ptr<f16>> %113 = llvm.getelementptr %111[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>> %114 = llvm.bitcast %113 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64> llvm.store %10, %114 : !llvm.ptr<i64> %115 = llvm.call @malloc(%109) : (i64) -> !llvm.ptr<i8> "llvm.intr.memcpy"(%115, %110, %109, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> () %116 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)> %117 = llvm.insertvalue %10, %116[0] : !llvm.struct<(i64, ptr<i8>)> %118 = llvm.insertvalue %115, %117[1] : !llvm.struct<(i64, ptr<i8>)> llvm.return %118 : !llvm.struct<(i64, ptr<i8>)> ^bb14(%119: i64, %120: i64): // 2 preds: ^bb12, ^bb15 %121 = llvm.icmp "sge" %119, %10 : i64 llvm.cond_br %121, ^bb15, ^bb16 ^bb15: // pred: ^bb14 %122 = llvm.getelementptr %21[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> %123 = llvm.load %122 : !llvm.ptr<i64> %124 = llvm.getelementptr %100[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> llvm.store %123, %124 : !llvm.ptr<i64> %125 = llvm.getelementptr %101[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64> llvm.store %120, %125 : !llvm.ptr<i64> %126 = llvm.mul %120, %123 : i64 %127 = llvm.sub %119, %11 : i64 llvm.br ^bb14(%127, %126 : i64, i64) ^bb16: // pred: ^bb14 %128 = llvm.call @malloc(%93) : (i64) -> !llvm.ptr<i8> "llvm.intr.memcpy"(%128, %94, %93, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> () %129 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)> %130 = llvm.insertvalue %arg1, %129[0] : !llvm.struct<(i64, ptr<i8>)> %131 = llvm.insertvalue %128, %130[1] : !llvm.struct<(i64, ptr<i8>)> llvm.return %131 : !llvm.struct<(i64, ptr<i8>)> } llvm.func @_mlir_ciface_Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<struct<(i64, ptr<i8>)>>, %arg1: !llvm.ptr<i8>, %arg2: !llvm.ptr<struct<(i64, ptr<i8>)>>) 
attributes {llvm.emit_c_interface, tf_entry} { %0 = llvm.load %arg2 : !llvm.ptr<struct<(i64, ptr<i8>)>> %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)> %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)> %3 = llvm.call @Rsqrt_CPU_DT_HALF_DT_HALF(%arg1, %1, %2) : (!llvm.ptr<i8>, i64, !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> llvm.store %3, %arg0 : !llvm.ptr<struct<(i64, ptr<i8>)>> llvm.return } }	2022-06-15 13:24:24 +02:00
Phoebe Wang	6e02e27536	Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI" Disabled 2 mlir tests because the runtime doesn't support `_Float16`, see the issue here https://github.com/llvm/llvm-project/issues/55992	2022-06-15 09:15:31 +08:00
Jonas Devlieghere	33b6891db2	[dsymutil] Automatically generate a reproducer when dsymutil crashes Automatically generate a reproducer when dsymutil crashes. We already support generating reproducers with the --gen-reproducer flag, which emits a reproducer on exit. This patch adds support for doing the same on a crash and makes it the default behavior. rdar://68357665 Differential revision: https://reviews.llvm.org/D127441	2022-06-14 16:00:08 -07:00
Chuanqi Xu	735e6c40b5	[Coroutines] Convert coroutine.presplit to enum attr This is required by @nikic in https://reviews.llvm.org/D127383 to decrease the cost to check whether a function is a coroutine and this fixes a FIXME too. Reviewed By: rjmccall, ezhulenev Differential Revision: https://reviews.llvm.org/D127471	2022-06-14 14:23:46 +08:00
Mehdi Amini	5d8298a768	Revert "[X86][RFC] Enable `_Float16` type support on X86 following the psABI" This reverts commit `2d2da259c8`. This breaks MLIR integration test (JIT crashing), reverting in the meantime.	2022-06-12 15:14:37 +00:00
Phoebe Wang	2d2da259c8	[X86][RFC] Enable `_Float16` type support on X86 following the psABI GCC and Clang/LLVM will support `_Float16` on X86 in C/C++, following the latest X86 psABI. (https://gitlab.com/x86-psABIs) _Float16 arithmetic will be performed using native half-precision. If native arithmetic instructions are not available, it will be performed at a higher precision (currently always float) and then truncated down to _Float16 immediately after each single arithmetic operation. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D107082	2022-06-12 11:40:00 +08:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Mitch Phillips	35b1a64589	Add documentation of new sanitizer-specific GV attributes. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D126922	2022-06-10 12:46:02 -07:00
Jay Foad	044b8f4bc8	[AMDGPU] Restore documentation of .amdhsa_shared_vgpr_count This was accidentally lost in D127402.	2022-06-10 17:06:08 +01:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the intrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
Jay Foad	b0a3849439	[AMDGPU] Update dlc usage for GFX11 In GFX10 dlc controlled L1 cache bypass. In GFX11 it has been repurposed to control MALL NOALLOC, and glc controls L1 as well as L0 cache bypass. Update the documentation and SIMemoryLegalizer accordingly. Set dlc for nontemporal and volatile accesses. Differential Revision: https://reviews.llvm.org/D127405	2022-06-10 08:10:34 +01:00
Tony	802e3f4f57	[AMDGPU] Add GFX11 documentation to AMDGPUUsage Update most of the document to include GFX11. Memory model changes will come later. Differential Revision: https://reviews.llvm.org/D127402	2022-06-10 08:10:34 +01:00
Nikita Popov	10ac235b07	[Docs] Add version support information for opaque pointers (NFC) I've seen a few people try to enable opaque pointers with LLVM 14 already. While LLVM 14 has pretty good baseline support, there are enough missing pieces that you're definitely going to hit assertion failures if you try this. Add some wording to make it clear what the support (or planned support) for opaque/typed pointers is across LLVM 14, 15, and 16.	2022-06-08 11:54:34 +02:00
Martin Storsjö	20ca739701	[doc] Add release notes about SEH unwind information on ARM Differential Revision: https://reviews.llvm.org/D127150	2022-06-08 11:32:17 +03:00
Nathan Lanza	9b3c5cba9f	Update the ProgrammersManual explanation for ilist and iplist They are now `using` aliases and thus the comments about iplist are now incorrect. Remove them here. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D95210	2022-06-07 22:50:04 -04:00
J. Ryan Stinnett	b878245af9	[DebugInfo][Docs] Improve code formatting in instruction referencing doc This adds code blocks and inline code formatting to improve the readability of the instruction referencing doc. Reviewed By: Orlando Differential Revision: https://reviews.llvm.org/D126767	2022-06-07 13:18:12 +02:00
Shilei Tian	0c3e6e5717	[NFC] Remove trailing whitespace	2022-06-06 18:59:13 -04:00
Fangrui Song	f9e9037c86	[docs] Fix style and typo in HowToSetUpLLVMStyleRTTI.rst after D126943	2022-06-06 12:41:21 -07:00
Dmitry Preobrazhensky	4fed5f174f	[AMDGPU][GFX8][DOC][NFC] Update assembler syntax description Summary of changes: - Updated MUBUF lds syntax (see https://reviews.llvm.org/D124485). - Enabled literals with src0 for v_madak, v_madmk (see https://reviews.llvm.org/D111067). - Minor bug fixing.	2022-06-06 17:42:16 +03:00
Dmitry Preobrazhensky	9c7e803f2d	[AMDGPU][GFX7][DOC][NFC] Update assembler syntax description Summary of changes: - Updated MUBUF lds syntax (see https://reviews.llvm.org/D124485). - Enabled literals with src0 of v_madak_f32, v_madmk_f32 (see https://reviews.llvm.org/D111067). - Corrected LGKM_CNT description. - Minor bug fixing.	2022-06-06 15:50:10 +03:00
Fangrui Song	d0d1c416cb	Remove unneeded cl::ZeroOrMore for cl::list options	2022-06-04 23:51:13 -07:00
Yuki Okushi	fa7b4cf05e	[docs] Remove a link to an outdated Go docs That link returns 404, we have bindings code on https://github.com/llvm/llvm-project/tree/main/llvm/bindings/go but it seems we haven't published it and there are no docs yet. Differential Revision: https://reviews.llvm.org/D126874	2022-06-03 23:50:35 +09:00
Kristof Beyls	8b18572ea7	[docs] Fix RST code-block syntax in HowToSetUpLLVMStyleRTTI.rst	2022-06-03 11:24:49 +02:00
bzcheeseman	47231248f5	[LLVM][Docs] Update for HowToSetUpLLVMStyleRTTI.rst, NFC. This patch updates the document with some advanced use cases and examples on how to set up and use LLVM-style RTTI. It includes a few motivating examples to get readers comfortable with the concepts. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D126943	2022-06-02 22:34:38 -07:00
Fangrui Song	dfa9221aa7	[docs] Mention LLVMContext::setOpaquePointers for C++ API	2022-06-02 13:28:42 -07:00
Guillaume Chatelet	53efdf33f8	Fix llvm.memset semantics description The description was referring to a ``src`` parameter probably copied over from ``llvm.memcpy``	2022-06-02 13:25:03 +02:00
Nikita Popov	b0ce6a0ae5	[Docs] Update default in opaque pointer docs (NFC) Also mention a relevant C API.	2022-06-02 12:29:07 +02:00
Martin Storsjö	668bb96379	[ARM] Implement lowering of the sponentry intrinsic This is needed for SEH based setjmp on Windows. Differential Revision: https://reviews.llvm.org/D126763	2022-06-02 12:29:59 +03:00
Nikita Popov	41d5033eb1	[IR] Enable opaque pointers by default This enabled opaque pointers by default in LLVM. The effect of this is twofold: * If IR that contains neither explicit ptr nor %T* types is passed to tools, we will now use opaque pointer mode, unless -opaque-pointers=0 has been explicitly passed. * Users of LLVM as a library will now default to opaque pointers. It is possible to opt-out by calling setOpaquePointers(false) on LLVMContext. A cmake option to toggle this default will not be provided. Frontends or other tools that want to (temporarily) keep using typed pointers should disable opaque pointers via LLVMContext. Differential Revision: https://reviews.llvm.org/D126689	2022-06-02 09:40:56 +02:00
Matt Arsenault	09a539e926	AMDGPU: Add release notes about atomic load and store	2022-06-01 21:14:48 -04:00
Matthias Braun	850d53a197	LTO: Decide upfront whether to use opaque/non-opaque pointer types LTO code may end up mixing bitcode files from various sources varying in their use of opaque pointer types. The current strategy to decide between opaque / typed pointers upon the first bitcode file loaded does not work here, since we could be loading a non-opaque bitcode file first and would then be unable to load any files with opaque pointer types later. So for LTO this: - Adds an `lto::Config::OpaquePointer` option and enforces an upfront decision between the two modes. - Adds `-opaque-pointers`/`-no-opaque-pointers` options to the gold plugin; disabled by default. - `--opaque-pointers`/`--no-opaque-pointers` options with `-plugin-opt=-opaque-pointers`/`-plugin-opt=-no-opaque-pointers` aliases to lld; disabled by default. - Adds an `-lto-opaque-pointers` option to the `llvm-lto2` tool. - Changes the clang driver to pass `-plugin-opt=-opaque-pointers` to the linker in LTO modes when clang was configured with opaque pointers enabled by default. This fixes https://github.com/llvm/llvm-project/issues/55377 Differential Revision: https://reviews.llvm.org/D125847	2022-06-01 18:05:53 -07:00
owenca	3d56131bf6	[Docs] Clarify the guideline on omitting braces While working on a clang-format option RemoveBracesLLVM that removes braces following the guideline, we were unsure about what to do with the braces of do-while loops. The ratio of using to omitting the braces is about 4:1 in the llvm-project source, so it will help to add an example to the guideline. Also cleans up the original examples including making the nested if example more targeted on avoiding potential dangling else situations. Differential Revision: https://reviews.llvm.org/D126512	2022-05-31 23:35:30 -07:00
Augie Fackler	b0a1a308f2	LangRef: fix bad indentation in allockind bullets	2022-05-31 11:06:43 -04:00
Augie Fackler	42861faa8e	attributes: introduce allockind attr for describing allocator fn behavior I chose to encode the allockind information in a string constant because otherwise we would get a bit of an explosion of keywords to deal with the possible permutations of allocation function types. I'm not sure that CodeGen.h is the correct place for this enum, but it seemed to kind of match the UWTableKind enum so I put it in the same place. Constructive suggestions on a better location most certainly encouraged. Differential Revision: https://reviews.llvm.org/D123088	2022-05-31 10:01:17 -04:00
Dmitry Preobrazhensky	62c46093f1	[AMDGPU][DOC][NFC] Add GFX90C and GFX940 assembler syntax description	2022-05-31 14:29:06 +03:00
Lian Wang	967ef4ad0a	[NFC][VP] Fix llvm.vp.merge intrinsic Expansion in LangRef Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D126457	2022-05-30 01:43:41 +00:00
Yuki Okushi	0a2d2eed43	[docs] Update the label name for new contributors The `beginner` label is deprecated and the `good first issue` label is now preferred. Differential Revision: https://reviews.llvm.org/D126526	2022-05-28 23:14:31 +09:00
Anastasia Stulova	7df25978ef	[Doc][OpenCL] Misc wording improvements for SPIR-V	2022-05-27 11:13:06 +01:00
Serge Pavlov	bdd0093f4d	[GlobalISel] Add G_IS_FPCLASS Add a generic opcode to represent `llvm.is_fpclass` intrinsic. Differential Revision: https://reviews.llvm.org/D121454	2022-05-27 13:49:47 +07:00
Sebastian Peryt	d1c5da34a7	[DOC] Improve LangRef description of declare This patch fixes formatting inside Functions section of declare by making it consistent with the way how define is written. Fixes #39844 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D125581	2022-05-26 14:34:34 -07:00
Sebastian Peryt	0f64945352	[DOC] Refactor Functions section in LangRef This change is a small refactor of Functions section to update placement of define syntax. Reviewed By: RKSimon Differential revision: https://reviews.llvm.org/D125831	2022-05-26 14:34:34 -07:00
Takafumi Arakaki	18e6b8234a	Allow pointer types for atomicrmw xchg This adds support for pointer types for `atomic xchg` and lets us write instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This is similar to the patch for allowing atomicrmw xchg on floating point types: https://reviews.llvm.org/D52416. Differential Revision: https://reviews.llvm.org/D124728	2022-05-25 16:20:26 +00:00
Ivan Kosarev	046f901735	[TableGen] Undeprecate 'field' when used with the CodeEmitterGen backend. Differential Revision: https://reviews.llvm.org/D126290	2022-05-25 15:15:19 +01:00
Kristof Beyls	8d29187506	Minutes for security group sync-ups have moved to Discourse.	2022-05-24 13:46:08 +02:00
Fangrui Song	224a8653c9	[llvm-nm][docs] Document -W and -U Latest GNU nm (milestone: 2.39) has added -W/--no-weak and changed -U to mean --defined-only (instead of --unicode=). The changes match our semantics. Close #55297 Reviewed by: jhenderson, keith Differential Revision: https://reviews.llvm.org/D126133	2022-05-23 09:58:54 -07:00

9578 Commits