llvm-project

Commit Graph

Author	SHA1	Message	Date
Nico Weber	e23c6cc54e	[aarch64/mac] Correctly disassemble @TLVPPAGE(OFF) relocs `llvm-otool -tV foo.o` and `llvm-objdump --macho -d foo.o` would previously fail on object files containing @TLVPPAGE or @TLVPPAGEOFF relocs. Move llvm-objdump-specific test from llvm/test/MC/AArch64/arm64-tls-modifiers-darwin.s to new llvm/test/tools/llvm-objdump/MachO/disassemble-arm64-tlv-modifers.test and put test for this fix to that new file. Fixes PR52356. Differential Revision: https://reviews.llvm.org/D112843	2021-11-10 10:41:18 -05:00
Esme-Yi	ab97ffb96a	Reland [XCOFF][yaml2obj] support for the auxiliary file header. Summary: Fix the build failure on MSVC by making the `T` and `U` of the function 'T llvm::Optional<T>::getValueOr<llvm::yaml::Hex32>(U &&) const &' the same. Differential Revision: https://reviews.llvm.org/D111487	2021-11-10 07:23:56 +00:00
David Blaikie	58b1b6414b	llvm-dwarfdump: Lookup type units when prettyprinting types This handles DWARFv4 and DWARFv5 type units, but not Split DWARF type units. That'll come in a follow-up patch.	2021-11-09 16:58:22 -08:00
Gulfem Savrun Yeniceri	126e7611c7	[compiler-rt] Fix diagnostic in InstrProfError This patch fixes some issues introduced in https://reviews.llvm.org/D108942: 1) Remove the default label to fix the bots that use -Werror,-Wcovered-switch-default 2) Modify the malformed test to fix the bots that are built without zlib support 3) Modify some error messages in malformed profiles	2021-11-09 20:30:03 +00:00
Dwight Guth	16c3db8def	[llvm-reduce] Fix invalid reduction in basic-blocks delta pass Previously, if the basic-blocks delta pass tried to remove a basic block that was the last basic block in a function that did not have external or weak linkage, the resulting IR would become invalid. Since removing the last basic block in a function is effectively identical to removing the function body itself, we check explicitly for this case and if we detect it, we run the same logic as in ReduceFunctionBodies.cpp Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D113486	2021-11-09 10:43:38 -08:00
Dwight Guth	fbfd327fdf	[llvm-reduce] Add flag to start at finer granularity Sometimes if llvm-reduce is interrupted in the middle of a delta pass on a large file, it can take quite some time for the tool to start actually doing new work if it is restarted again on the partially-reduced file. A lot of time ends up being spent testing large chunks when these large chunks are very unlikely to actually pass the interestingness test. In cases like this, the tool will complete faster if the starting granularity is reduced to a finer amount. Thus, we introduce a command line flag that automatically divides the chunks into smaller subsets a fixed, user-specified number of times prior to beginning the core loop. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D112651	2021-11-09 10:14:08 -08:00
Fangrui Song	5f1e509579	[llvm-objdump] -p: Dump PE header for PE/COFF For a trivial DLL built with `clang --target=x86_64-windows -O2 -c a.c; lld-link -subsystem:console -dll a.o -out:a.dll`, `objdump -p` vs `llvm-objdump -p`: ``` -a.dll: file format pei-x86-64 - +a.dll: file format coff-x86-64 Characteristics 0x2022 executable large address aware @@ -57,4 +56,4 @@ Entry d 0000000000000000 00000000 Delay Import Directory Entry e 0000000000000000 00000000 CLR Runtime Header Entry f 0000000000000000 00000000 Reserved - +Export Table: ``` For a Linux image (`vmlinuz-5.10.76-gentoo-r1`) built with `CONFIG_EFI_STUB=y` ``` -vmlinuz-5.10.76-gentoo-r1: file format pei-x86-64 - -Characteristics 0x20e +vmlinuz-5.10.76-gentoo-r1: file format coff-x86-64 +Characteristics 0x206 executable line numbers stripped - symbols stripped debugging information removed Time/Date Wed Dec 31 16:00:00 1969 @@ -55,10 +53,4 @@ Entry d 0000000000000000 00000000 Delay Import Directory Entry e 0000000000000000 00000000 CLR Runtime Header Entry f 0000000000000000 00000000 Reserved - - -PE File Base Relocations (interpreted .reloc section contents) - -Virtual Address: 000037ca Chunk size 10 (0xa) Number of fixups 1 - reloc 0 offset 0 [37ca] ABSOLUTE - +Export Table: ``` `symbols stripped` looks like a GNU objdump problem. Reviewed By: jhenderson, alexander-shaposhnikov Differential Revision: https://reviews.llvm.org/D113356	2021-11-09 10:08:41 -08:00
Gulfem Savrun Yeniceri	ee88b8d63e	[compiler-rt] Add more diagnostic to InstrProfError If profile data is malformed for any kind of reason, we generate an error that only reports "malformed instrumentation profile data" without any further information. This patch extends InstrProfError class to receive an optional error message argument, so that we can do better error reporting. Differential Revision: https://reviews.llvm.org/D108942	2021-11-09 18:04:12 +00:00
Alexey Lapshin	c8ae08987d	[llvm-dwarfdump] dump link to the immediate parent. It is often useful to know which die is the parent of the current die. This patch adds information about parent offset into the dump: 0x0000000b: DW_TAG_compile_unit DW_AT_producer ("by_hand") 0x00000014: DW_TAG_base_type (0x0000000b) <<<<<<<<<<<<<< DW_AT_name ("int") Now it is easy to see which die is the parent of the current die. This patch makes that behaviour the default. We can make it opt-in if necessary. This functionality differs from the existing "--show-parents" in the sense that parent information is shown for all dies and only a link to the immediate parent is shown. Differential Revision: https://reviews.llvm.org/D113406	2021-11-09 14:14:06 +03:00
Simon Pilgrim	32a4a883f6	Revert rGe1eec7601b6988b35ae3cdc8d67cf3cf4e1361dd "[XCOFF][yaml2obj] support for the auxiliary file header." This is failing on MSVC builds: https://lab.llvm.org/buildbot/#/builders/86/builds/23436	2021-11-09 11:02:13 +00:00
Esme-Yi	e1eec7601b	[XCOFF][yaml2obj] support for the auxiliary file header. Summary: This patch adds yaml2obj support for the auxiliary file header of XCOFF. Reviewed By: DiggerLin, jhenderson Differential Revision: https://reviews.llvm.org/D111487	2021-11-09 09:48:40 +00:00
Paul Robinson	38be8f4057	Add llvm-tli-checker A new tool that compares TargetLibraryInfo's opinion of the availability of library function calls against the functions actually exported by a specified set of libraries. Can be helpful in verifying the correctness of TLI for a given target, and avoid mishaps such as had to be addressed in D107509 and `94b4598d`. The tool currently supports ELF object files only, although it's unlikely to be hard to add support for other formats. Re-commits `62dd488` with changes to use pre-generated objects, as not all bots have ld.lld available. Differential Revision: https://reviews.llvm.org/D111358	2021-11-08 16:29:28 -08:00
Paul Robinson	1297c21406	Revert "Add llvm-tli-checker" Not all bots have ld.lld available. This reverts commit `62dd488164`.	2021-11-08 15:48:29 -08:00
Paul Robinson	62dd488164	Add llvm-tli-checker A new tool that compares TargetLibraryInfo's opinion of the availability of library function calls against the functions actually exported by a specified set of libraries. Can be helpful in verifying the correctness of TLI for a given target, and avoid mishaps such as had to be addressed in D107509 and `94b4598d`. The tool currently supports ELF object files only, although it's unlikely to be hard to add support for other formats. Differential Revision: https://reviews.llvm.org/D111358	2021-11-08 14:59:13 -08:00
Adrian Prantl	8bd8dd16e2	Extend obj2yaml to optionally preserve raw __LINKEDIT/__DATA segments. I am planning to upstream MachOObjectFile code to support Darwin chained fixups. In order to test the new parser features we need a way to produce correct (and incorrect) chained fixups. Right now the only tool that can produce them is the Darwin linker. To avoid having to check in binary files, this patch allows obj2yaml to print a hexdump of the raw LINKEDIT and DATA segment, which both allows to bootstrap the parser and enables us to easily create malformed inputs to test error handling in the parser. This patch adds two new options to obj2yaml: -raw-data-segment -raw-linkedit-segment Differential Revision: https://reviews.llvm.org/D113234	2021-11-08 11:30:12 -08:00
Zarko Todorovski	c4396b77ae	[LLVM][llvm-cfi] Inclusive language: replace uses of blacklist with ignorelist Replace the description and file names for this argument. As far as I understand this is a positional argument and I don't believe this change breaks any existing interfaces. Reviewed By: hctim, MaskRay Differential Revision: https://reviews.llvm.org/D113316	2021-11-08 10:05:52 -05:00
Esme-Yi	9b6f264d2b	[XCOFF][llvm-readobj] improve the relocation output. Summary: 1. implemented the unexpanded relocations output. 2. modified the expanded output format to align. Reviewed By: shchenz, jhenderson Differential Revision: https://reviews.llvm.org/D111700	2021-11-08 03:15:52 +00:00
David Blaikie	0a5c26f2ef	DebugInfo: Simplified Template Names: drop unneeded space in arrays Matching a recent clang change I've made, now 'int[3]' is formatted without the space between the type and array bound. This commit updates libDebugInfoDWARF/llvm-dwarfdump to match that formatting.	2021-11-05 22:50:57 -07:00
wlei	5bf191a381	[llvm-profgen] Fix index out of bounds error while using ip.advance Previously we assumed there are some non-executing sections at the bottom of the text section so that we won't hit the array's bound. But on a BOLTed binary, it turned out the .bolt section is at the bottom of the text section and can be profiled, which crashes llvm-profgen. This change tries to fix it. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D113238	2021-11-05 18:38:40 -07:00
David Blaikie	f57d0e2726	DWARF Simplified Template Names: Narrow down the handling for operator overloads Actually we can, for now, remove the explicit "operator" handling entirely - since clang currently won't try to flag any of these as rebuildable. That seems like a reasonable state for now, but it could be narrowed down to only apply to conversion operators, most likely - but would need more nuance for op> and op>> since they would be incorrectly flagged as already having their template arguments (due to the trailing '>').	2021-11-05 15:41:56 -07:00
Fangrui Song	26a8ceba3e	[llvm-readobj] Display DT_RELRSZ/DT_RELRENT as " (bytes)" to match RELSZ/RELENT. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D113206	2021-11-05 10:02:49 -07:00
gbreynoo	ced9287c2d	[llvm-objdump] Fix the Assertion failure when providing invalid --debug-vars or --dwarf values As seen in https://bugs.llvm.org/show_bug.cgi?id=52213 llvm-objdump asserts if either the --debug-vars or the --dwarf options are provided with invalid values. As suggested, this fix adds use of a default value to these options and errors when given bad input. Differential Revision: https://reviews.llvm.org/D112183	2021-11-04 11:01:32 +00:00
wlei	138202a8c3	[llvm-profgen] Warn on invalid range and show warning summary Two things in this diff: 1) Warn on the invalid range, currently three types of checking, see the detailed message in the code. 2) In some situation, llvm-profgen gives lots of warnings on the truncated stacks which is noisy. This change provides a switch to `--show-detailed-warning` to skip the warnings. Alternatively, we use a summary for those warning and show the percentage of cases with those issues. Example of warning summary. ``` warning: 0.05%(1120/2428958) cases with issue: Profile context truncated due to missing probe for call instruction. warning: 0.00%(2/178637) cases with issue: Range does not belong to any functions, likely from external function. ``` Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D111902	2021-11-02 19:55:55 -07:00
Hongtao Yu	d0eb472f33	[llvm-profdata] Print out section flags for FunctionMetadata section As titled. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D113064	2021-11-02 17:59:22 -07:00
Arthur Eubanks	f54a8759f0	[llvm-reduce] Reduce more GlobalValue properties Reviewed By: hans Differential Revision: https://reviews.llvm.org/D112885	2021-11-02 08:47:41 -07:00
Arthur Eubanks	80ba72b07b	[llvm-reduce] Reduce some GlobalObject properties Specifically, the section and the alignment. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D112884	2021-11-02 08:47:32 -07:00
Frederic Cambus	650311737e	[llvm-readobj] Add support for reading OpenBSD ELF core notes. Notes generated in OpenBSD core files provide additional information about the kernel state and CPU registers. These notes are described in core.5, which can be viewed here: https://man.openbsd.org/core.5 Differential Revision: https://reviews.llvm.org/D111966	2021-11-02 10:18:54 +01:00
Markus Lavin	fd41738e2c	Recommit "[llvm-reduce] Add MIR support" (Second try. Need to link against CodeGen and MC libs.) The llvm-reduce tool has been extended to operate on MIR (import, clone and export). Current limitation is that only a single machine function is supported. A single reducer pass that operates on machine instructions (while on SSA-form) has been added. Additional MIR specific reducer passes can be added later as needed. Differential Revision: https://reviews.llvm.org/D110527	2021-11-02 10:16:42 +01:00
Markus Lavin	aee7f3384b	Revert "[llvm-reduce] Add MIR support" This reverts commit `bc2773cb1b`. Broke the clang-ppc64le-linux-multistage build. Reverting while I investigate.	2021-11-02 09:41:02 +01:00
Markus Lavin	bc2773cb1b	[llvm-reduce] Add MIR support The llvm-reduce tool has been extended to operate on MIR (import, clone and export). Current limitation is that only a single machine function is supported. A single reducer pass that operates on machine instructions (while on SSA-form) has been added. Additional MIR specific reducer passes can be added later as needed. Differential Revision: https://reviews.llvm.org/D110527	2021-11-02 09:14:56 +01:00
wlei	3f3103c6a9	[llvm-profgen] Fill zero count for all function ranges Allow filling zero count for all the function ranges even if there are no samples hitting that function. Add a switch for this. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D112858	2021-11-01 09:57:05 -07:00
Esme-Yi	81441cf44c	[XCOFF] [llvm-readobj] replace tests using binary as input with tests generated by yaml2obj. Summary: Because yaml2obj supports basic transforming for XCOFF, some of the binary inputs used in the tests of llvm-readobj can be replaced with yaml files. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D111699	2021-11-01 08:43:32 +00:00
wlei	f5537643b8	[llvm-profgen] Update total samples by accumulating all its body samples Like probe-based profile, the total samples is the sum of all its body samples. This patch fixes it by a post-processing update for the line-number based profile. Tested it on our internal services, results showed no performance change. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D112672	2021-10-29 10:36:57 -07:00
wlei	2f8196db92	[llvm-profgen] Fix bug of populating profile symbol list Previous implementation of populating profile symbol list is wrong, it only included the profiled symbols. Actually it should use all symbols, here this switches to use the symbols from debug info. Also turned the flag off by default. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D111824	2021-10-29 09:59:12 -07:00
Arthur Eubanks	177a703710	[llvm-reduce] Actually skip invalid candidates in operands-to-args This was checked while counting but not actually when doing the reduction, resulting in crashes. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D112766	2021-10-29 09:14:18 -07:00
David Blaikie	b65f24a74c	llvm-dwarfdump --verify: Don't diagnose functions in different sections as overlapping Functions in different sections (common in object files - inline functions, -ffunction-sections, etc) can't overlap, so factor in the section when diagnosing overlapping address ranges. This removes a major false-positive when running llvm-dwarfdump on unlinked code.	2021-10-28 17:13:57 -07:00
Hongtao Yu	259e4c5658	[CSSPGO] Trim cold base profiles for the CS preinliner. Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D112489	2021-10-27 22:50:27 -07:00
Djordje Todorovic	40c2bdf6d1	[llvm-locstats] Move the test from D110621 into test/llvm-locstats/ dir	2021-10-27 17:36:19 +02:00
djtodoro	30a3652b6a	[llvm-locstats] Report a warning if overflow was detected by llvm-dwarfdump Catch that llvm-dwarfdump detected an overflow in statistics. Differential Revision: https://reviews.llvm.org/D110621	2021-10-27 14:35:29 +02:00
Nico Weber	3c0cf7e1a9	Unbreak code_signature_lc.test on macOS after `911be05743`	2021-10-26 21:05:48 -04:00
Daniel Rodríguez Troitiño	911be05743	[test][objcopy] Replace GNU sed extension with BSD compatible syntax. GNU sed offers the `,+4d` to delete the line and the next four lines, but BSD sed doesn't seem to support it (at least in macOS 10.15, but it seems to in my 11.6 version). Replace the usage of the extension with the equivalent syntax that works both in BSD and GNU sed. I don't have a macOS 10.15 to check, but this works in both my macOS 11.6 and Linux machines. Differential Revision: https://reviews.llvm.org/D112583	2021-10-26 17:35:56 -07:00
David Blaikie	3ac709b6ce	llvm-dwarfdump --verify: Exit non-zero on simplified template name rebuilding failures	2021-10-26 15:57:16 -07:00
Nuri Amari	a299b24712	Regenerate LC_CODE_SIGNATURE during llvm-objcopy operations Context: This is a second attempt at introducing signature regeneration to llvm-objcopy. In this diff: https://reviews.llvm.org/D109840, a script was introduced to test the validity of a code signature. In this diff: https://reviews.llvm.org/D109803 (now reverted), an effort was made to extract the signature generation behavior out of LLD into a common location for use in llvm-objcopy. In this diff: https://reviews.llvm.org/D109972 it was decided that there was no appropriate common location and that a small amount of duplication to bring signature generation to llvm-objcopy would be better. This diff introduces this duplication. Summary Prior to this change, if a LC_CODE_SIGNATURE load command was included in the binary passed to llvm-objcopy, the command and associated section were simply copied and included verbatim in the new binary. If rest of the binary was modified at all, this results in an invalid Mach-O file. This change regenerates the signature rather than copying it. The code_signature_lc.test test was modified to include the yaml representation of a small signed MachO executable in order to effectively test the signature generation. Reviewed By: alexander-shaposhnikov, #lld-macho Differential Revision: https://reviews.llvm.org/D111164	2021-10-26 14:51:13 -07:00
zhijian	c2d2fb5093	Address a test error on Windows: exclude the test llvm/test/tools/llvm-readobj/XCOFF/xcoff-auxiliary-header.test from Windows. http://45.33.8.238/win/47662/step_11.txt for https://reviews.llvm.org/D82549	2021-10-26 13:56:52 -04:00
zhijian	158083f0de	[AIX][XCOFF] parsing xcoff object file auxiliary header Summary: The patch supports parsing the xcoff object file auxiliary header with llvm-readobj with option "auxiliary-headers" the format of auxiliary header as https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/filesreference/XCOFF.html#XCOFF__fyovh386shar Reviewers: James Henderson, Jason Liu, Hubert Tong, Esme yi, Sean Fertile. Differential Revision: https://reviews.llvm.org/D82549	2021-10-26 10:40:25 -04:00
wlei	a5f411b7f8	[llvm-profgen] Allow unsymbolized profile as perf input This change allows the unsymbolized profile as input. The unsymbolized profile is created by `llvm-profgen` with `--skip-symbolization`; it is produced after the sample aggregation but before symbolization, so it has a much smaller file size. It can be used for sample merging and trimming, and is also useful for debugging or adding test cases. A switch `--unsymbolized-profile=file-patch` is added for this. Format of unsymbolized profile: ``` [context stack1] # If it's a CS profile number of entries in RangeCounter from_1-to_1:count_1 from_2-to_2:count_2 ...... from_n-to_n:count_n number of entries in BranchCounter src_1->dst_1:count_1 src_2->dst_2:count_2 ...... src_n->dst_n:count_n [context stack2] ...... ``` Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D111750	2021-10-25 23:58:08 -07:00
Jack Anderson	d7733f8422	[DebugInfo] Expand ability to load 2-byte addresses in dwarf sections Some dwarf loaders in LLVM are hard-coded to only accept 4-byte and 8-byte address sizes. This patch generalizes acceptance into `DWARFContext::isAddressSizeSupported` and provides a common way to generate rejection errors. The MSP430 target has been given new tests to cover dwarf loading cases that previously failed due to 2-byte addresses. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D111953	2021-10-21 17:31:00 -07:00
Wenlei He	e8c245dcd3	[llvm-profgen] Skip duplication factor outside of body sample computation We incorrectly use duplication factor for total samples even though we already accumulate samples instead of taking MAX. This causes the profile to have bloated total samples for functions with loops unrolled or vectorized. The change fixes the issue for total sample, head sample and call target samples. Differential Revision: https://reviews.llvm.org/D112042	2021-10-19 23:10:45 -07:00
Arthur Eubanks	9660563950	[llvm-reduce] Add reduction passes to reduce operands to undef/1/0 Having non-undef constants in a final llvm-reduce output is nicer than having undefs. This splits the existing reduce-operands pass into three, one which does the same as the current pass of reducing to undef, and two more to reduce to the constant 1 and the constant 0. Do not reduce to undef if the operand is a ConstantData, and do not reduce 0s to 1s. Reducing GEP operands very frequently causes invalid IR (since types may not match up if we index differently into a struct), so don't touch GEPs. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D111765	2021-10-19 15:25:21 -07:00
Simon Pilgrim	0bb32b1b21	[X86][SLM] Fix BitTest+Set uops + port usage Both ports are required for BitTest ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures and what Intel AoM / Agner reports as well.	2021-10-17 18:13:15 +01:00
Simon Pilgrim	5ed5df4802	[X86][SLM] Fix uops for PCMPISTR/PCMPISTR instructions Based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.	2021-10-17 18:13:14 +01:00
Simon Pilgrim	680afaaa5d	[X86][SLM] Fix uops for PCLMULQDQ Based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.	2021-10-17 18:13:14 +01:00
Simon Pilgrim	498c7236bc	[X86][SLM] +1uop for PSHUFBrm xmm Extra 1uop for folded pshufb ops, based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.	2021-10-17 18:13:14 +01:00
djtodoro	c450e47a8c	[llvm-dwarfdump] Fix unsigned overflow when calculating stats This fixes https://bugs.llvm.org/show_bug.cgi?id=51652. The idea is to bump all the stat fields to 64-bit wide unsigned integers. I've confirmed this resolves the use case for chromium. Differential Revision: https://reviews.llvm.org/D109217	2021-10-15 12:15:58 +02:00
Craig Topper	3ff9cc01f2	[X86] Use CMOVNS for abs instead of CMOVGE. CMOVGE reads SF and OF. CMOVNS only reads SF. This matches with other recent changes to use a single flag where possible. It also matches gcc codegen. I believe this technically changes whether the conditional move happens on INT_MIN, but for INT_MIN both registers are the same so it doesn't matter. Differential Revision: https://reviews.llvm.org/D111826	2021-10-14 12:28:28 -07:00
Kai Nacke	b050564d3e	[AIX] Ignore case when comparing output from od POSIX does not define the exact output from od tool. While most implementations use lower case characters in hex output, the z/OS USS implementation uses upper case characters. To avoid LIT failures, the FileCheck option to ignore the case must be used when checking hex bytes. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D111427	2021-10-14 13:51:02 -04:00
Wenlei He	a316343e19	[llvm-profgen] Allow generating AutoFDO profile from CSSPGO binary Add `-use-dwarf-correlation` switch to allow llvm-profgen to generate AutoFDO profile for binaries built with CSSPGO (pseudo-probe). Differential Revision: https://reviews.llvm.org/D111776	2021-10-14 09:11:56 -07:00
wlei	30ca33eab0	[llvm-profgen] Ignore the whole trace with the leading external branch The first LBR entry can be an external branch, we should ignore the whole trace. ``` 7f7448e889e4 0x7f7448e889e4/0x7f7448e88826/P/-/-/1 0x7f7448e8899f/0x7f7448e889d8/P/-/-/4 ... ``` Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D111749	2021-10-13 16:52:29 -07:00
Michael Kruse	dd71b65ca8	[llvm-reduce] Introduce operands-to-args pass. Instead of setting operands to undef as the "operands" pass does, convert the operands to a function argument. This avoids having to introduce undef values into the IR which have some unpredictability during optimizations. For instance, define void @func() { entry: %val = add i32 32, 21 store i32 %val, i32* null ret void } is reduced to define void @func(i32 %val) { entry: %val1 = add i32 32, 21 store i32 %val, i32* null ret void } (note that the instruction %val is renamed to %val1 when printing the IR to avoid ambiguity; ideally %val1 would be removed by dce or the instruction reduction pass) Any call to @func is replaced with a call to the function with the new signature and filled with undef. This is not ideal for IPA passes, but those out-of-scope for now. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D111503	2021-10-13 09:54:03 -05:00
Arthur Eubanks	337cf0a5ab	[llc] Support -time-trace in llc Mostly copied from opt.cpp. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D111466	2021-10-11 10:16:46 -07:00
Esme-Yi	a00ff71668	[XCOFF] Improve error message context. Summary: This patch improves the error message context of the XCOFF interfaces by providing more details. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D110320	2021-10-11 02:52:20 +00:00
David Green	adec922361	[AArch64] Make -mcpu=generic schedule for an in-order core We would like to start pushing -mcpu=generic towards enabling the set of features that improves performance for some CPUs, without hurting any others. A blend of the performance options hopefully beneficial to all CPUs. The largest part of that is enabling in-order scheduling using the Cortex-A55 schedule model. This is similar to the Arm backend change from `eecb353d0e` which made -mcpu=generic perform in-order scheduling using the cortex-a8 schedule model. The idea is that in-order cpu's require the most help in instruction scheduling, whereas out-of-order cpus can for the most part out-of-order schedule around different codegen. Our benchmarking suggests that hypothesis holds. When running on an in-order core this improved performance by 3.8% geomean on a set of DSP workloads, 2% geomean on some other embedded benchmark and between 1% and 1.8% on a set of singlecore and multicore workloads, all running on a Cortex-A55 cluster. On an out-of-order cpu the results are a lot more noisy but show flat performance or an improvement. On the set of DSP and embedded benchmarks, run on a Cortex-A78 there was a very noisy 1% speed improvement. Using the most detailed results I could find, SPEC2006 runs on a Neoverse N1 show a small increase in instruction count (+0.127%), but a decrease in cycle counts (-0.155%, on average). The instruction count is very low noise, the cycle count is more noisy with a 0.15% decrease not being significant. SPEC2k17 shows a small decrease (-0.2%) in instruction count leading to a -0.296% decrease in cycle count. These results are within noise margins but tend to show a small improvement in general. When specifying an Apple target, clang will set "-target-cpu apple-a7" on the command line, so should not be affected by this change when running from clang. This also doesn't enable more runtime unrolling like -mcpu=cortex-a55 does, only changing the schedule used. 
A lot of existing tests have been updated. This is a summary of the important differences: - Most changes are the same instructions in a different order. - Sometimes this leads to very minor inefficiencies, such as requiring an extra mov to move variables into r0/v0 for the return value of a test function. - misched-fusion.ll was no longer fusing the pairs of instructions it should, as per D110561. I've changed the schedule used in the test for now. - neon-mla-mls.ll now uses "mul; sub" as opposed to "neg; mla" due to the different latencies. This seems fine to me. - Some SVE tests do not always remove movprfx where they did before due to different register allocation giving different destructive forms. - The tests argument-blocks-array-of-struct.ll and arm64-windows-calls.ll produce two LDR where they previously produced an LDP due to store-pair-suppress kicking in. - arm64-ldp.ll and arm64-neon-copy.ll are missing pre/postinc on LDP. - Some tests such as arm64-neon-mul-div.ll and ragreedy-local-interval-cost.ll have more, less or just different spilling. - In aarch64_generated_funcs.ll.generated.expected one part of the function is no longer outlined. Interestingly if I switch this to use any other schedule even less is outlined. Some of these are expected to happen, such as differences in outlining or register spilling. There will be places where these result in worse codegen, places where they are better, with the SPEC instruction counts suggesting it is not a decrease overall, on average. Differential Revision: https://reviews.llvm.org/D110830	2021-10-09 15:58:31 +01:00
Qiu Chaofan	573531fb1f	Fix typo of colon to semicolon in lit tests	2021-10-09 10:03:50 +08:00
Abhina Sreeskantharajan	7d7b139042	[test] Use host platform specific error message substitution This patch modifies the testcase to use error substitution so it will pass on all platforms. Reviewed By: fanbo-meng, muiez Differential Revision: https://reviews.llvm.org/D111320	2021-10-08 13:52:31 -04:00
wlei	b1a45c62f0	[llvm-profgen] Ignore branch count against outline function For some transformations like hot-cold split or coro split, it can outline its part of function ranges. Since sample loader is the early stage of backend and no split happens at that time, compiler can't recognize those functions, so in llvm-profgen we should attribute the sample to the original function. This is already done for the body range samples since we use the symbols from dwarf which is created before the split. But for branch samples, the call from master function to its outlined function is actually not a call to the original function, so we shouldn't add head/callsite samples for it. So instead of dwarf symbol, we use the symbols from symbol table and ignore those functions with special suffixes (like `.cold`, `.resume`) for accumulating the callsite/head samples. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110864	2021-10-07 14:03:34 -07:00
gbreynoo	9072183cb6	[llvm-objdump] Fix --prefix and --prefix-strip In the command guide --prefix and --prefix-strip is used in the form --prefix=<prefix> however currently it is used in the form --prefix <prefix>. This change fixes these options to match the command guide. Differential Revision: https://reviews.llvm.org/D110551	2021-10-07 15:53:45 +01:00
wlei	16516f8925	[llvm-profgen] Support symbol list for accurate profile Differential Revision: https://reviews.llvm.org/D110859	2021-10-06 11:41:39 -07:00
Petr Hosek	24c615fa6b	[InstrProfData] Bump the raw profile version to 8 This is to account for the change that made CountersPtr in __profd_ relative which landed in `a1532ed275`. That change hasn't updated the raw profile version, and while the profile layout stayed the same, profiles generated by tip-of-tree LLVM are incompatible with 13.x tooling. Differential Revision: https://reviews.llvm.org/D111123	2021-10-05 09:57:56 -07:00
gbhyamso	02895eede1	[llvm-cxxfilt][NFC] Fix test for running in Windows cmd The test llvm\test\tools\llvm-cxxfilt\delimiters.test started failing when run from cmd.exe on Windows after D110986, which added a unicode character (⦙) to it. Piping the unicode character in cmd.exe causes it to be converted to a '?'. That causes the test to fail because the llvm-cxxfilt output becomes Foo?Bar rather than the expected Foo⦙Bar. Redirect the echo output to and from a temporary file to get around this problem. It's not entirely clear what the root cause is, but two separate downstream builders are tripping up on this, so we are landing the workaround for the time being. Differential Revision: https://reviews.llvm.org/D111072	2021-10-05 12:10:06 +01:00
wlei	31a5cb3292	[llvm-profgen] Filter out invalid debug line Differential Revision: https://reviews.llvm.org/D110081	2021-10-04 19:09:06 -07:00
wlei	46cf7d75d9	[llvm-profgen] Add duplication factor for line-number based profile This change adds a duplication factor multiplier while accumulating body samples for line-number based profiles. The body sample count will be `duplication-factor * count`. The base discriminator and duplication factor are decoded from the raw discriminator, which requires some refactoring work. Differential Revision: https://reviews.llvm.org/D109934	2021-10-04 19:08:55 -07:00
Simon Pilgrim	7cae0daee6	[X86][Atom] Fix BSR/BSF uops + port usage Both ports are required for BitScan ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner reports as well.	2021-10-02 19:09:44 +01:00
Simon Pilgrim	8e7f6039fa	[X86] Atom SSE shift-by-variable take 2uops/3uops not 1uop Based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well.	2021-10-02 12:28:41 +01:00
Tomasz Miąsko	f33274c7bf	[llvm-cxxfilt] Replace isalnum with isAlnum from StringExtras D104366 introduced a new llvm-cxxfilt test with non-ASCII characters, which caused a failure on llvm-clang-x86_64-expensive-checks-win builder, with a stack trace suggesting issue in a call to isalnum. The argument to isalnum should be either EOF or a value that is representable in the type unsigned char. The llvm-cxxfilt does not perform a cast from char to unsigned char before the call, so the value might be out of valid range. Replace the call to isalnum with isAlnum from StringExtras, which takes a char as the argument. This also makes the check independent of the current locale. Differential Revision: https://reviews.llvm.org/D110986	2021-10-02 08:54:04 +02:00
zhijian	5b44c716ee	[AIX] Implement --syms and use "symbol index and qualname" for --sym --symbol-description for llvm-objdump for XCOFF Summary: For XCOFF: implement getSymbolFlag and getSymbolType() for the --syms option. With llvm-objdump --sym, if the symbol is a label, print the containing section for the symbol too. When using llvm-objdump --sym --symbol-description, print the symbol index and qualname for the symbol. For example, with --symbol-description: 00000000000000c0 l .text (csect: (idx: 2) .foov[PR]) (idx: 3) .foov and without --symbol-description: 00000000000000c0 l .text (csect: .foov) .foov Reviewers: James Henderson, Esme Yi Differential Revision: https://reviews.llvm.org/D109452	2021-10-01 12:37:51 -04:00
Florian Hahn	57fbb9ed0e	[llvm-reduce] Skip updating calls where OldF isn't the called fn. When replacing function calls, skip call instructions where the old function is not the called function, but e.g. the old function is passed as an argument. This fixes a crash due to trying to construct invalid IR for the test case. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D109759	2021-10-01 10:52:48 +01:00
Fangrui Song	8971b99c83	[llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling convention or the vector calling convention. See https://github.com/riscv/riscv-elf-psabi-doc/pull/190 Differential Revision: https://reviews.llvm.org/D107949	2021-09-29 16:56:52 -07:00
Wael Yehia	8b8da01d88	Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace." This reverts commit `a60405cf03`.	2021-09-29 19:43:35 +00:00
Michael Kruse	d9562a8e45	[llvm-reduce] Reduce metadata references. The ReduceMetadata pass before this patch removed metadata on a per-MDNode (or NamedMDNode) basis. Either all references to an MDNode are kept, or all of them are removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNodes. As a consequence, e.g. tbaa references to the same type will all have the same MDNode reference, which makes it impossible to reduce to only keeping metadata on those memory accesses for which it is interesting. Moreover, MDNodes can also be referenced by some intrinsics or other MDNodes. These references were not considered for removal, leading to the possibility that MDNodes are not actually removed even if selected to be removed by the oracle. This patch changes ReduceMetadata to reduce based on removable metadata references instead. MDNodes without references are implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed as it would violate the immutability of MDNodes. Additionally, the ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects an MDKind (such as `MD_tbaa`) as its first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind). Reviewed By: aeubanks, swamulism Differential Revision: https://reviews.llvm.org/D110534	2021-09-29 11:25:35 -05:00
David Green	e9adcbde31	[AArch64] Model Cortex-A55 Q register NEON instructions Cortex-A55 has 2 64bit NEON vector units, meaning a 128bit instruction requires taking both units (and can only be issued as the first instruction in a dual issue pair). This patch models that by splitting the WriteV SchedWrite into two: the WriteVd that reads/writes only 64bit operands, and the WriteVq that reads/writes 128bit registers. The A55 schedule then uses this distinction to model the WriteVq as taking both resource units and starting a Schedule Group, and WriteVd as taking one as before. I believe this is more correct, even if it does not lead to much better performance. Differential Revision: https://reviews.llvm.org/D108766	2021-09-29 16:55:31 +01:00
Wael Yehia	a60405cf03	[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace. Reviewed by: steven_wu, fhahn, tejohnson Differential Revision: https://reviews.llvm.org/D110075	2021-09-29 12:17:53 +00:00
Igor Kudrin	7b424b9333	[llvm-objcopy] Rename relocation sections together with their targets. As of now, llvm-objcopy renames only sections that are specified explicitly in --rename-section, while GNU objcopy keeps names of relocation sections in sync with their targets. For example: > readelf -S test.o ... [ 1] .foo PROGBITS [ 2] .rela.foo RELA > objcopy --rename-section .foo=.bar test.o gnu.o > readelf -S gnu.o ... [ 1] .bar PROGBITS [ 2] .rela.bar RELA > llvm-objcopy --rename-section .foo=.bar test.o llvm.o > readelf -S llvm.o ... [ 1] .bar PROGBITS [ 2] .rela.foo RELA This patch makes llvm-objcopy match the behavior of GNU objcopy better. Differential Revision: https://reviews.llvm.org/D110352	2021-09-29 16:36:37 +07:00
wlei	a03cf331e1	[llvm-profgen] Strip context to support non-CS profile generation for hybrid sample Differential Revision: https://reviews.llvm.org/D109769	2021-09-28 12:20:23 -07:00
Leonard Chan	b9f547e8e5	[llvm][profile] Add padding after binary IDs Some tests with binary IDs would fail with error: no profile can be merged. This is because raw profiles could have unaligned headers when emitting binary IDs. This means padding should be emitted after binary IDs are emitted to ensure everything else is aligned. This patch adds padding after each binary ID to ensure the next binary ID size is 8-byte aligned. This also adds extra checks to ensure we aren't reading corrupted data when printing binary IDs. Differential Revision: https://reviews.llvm.org/D110365	2021-09-28 11:50:50 -07:00
Fangrui Song	74a47e54be	[llvm-objdump] Fix -R display and support ET_EXEC * Add a newline before `DYNAMIC RELOCATION RECORDS` (see D101796) * Add the missing `OFFSET TYPE VALUE` line * Align columns Note: llvm-readobj/ELFDumper.cpp `loadDynamicTable` has sophisticated PT_DYNAMIC code which is unavailable in llvm-objdump. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D110595	2021-09-28 09:58:27 -07:00
Alex Richardson	547e5e4ae6	[update_llc_test_checks.py] Fix MIPS ASM regex for functions with EH On MIPS, functions with exception handling code emit an additional temporary label at the start of the function (due to UseAssignmentForEHBegin): _Z8do_catchv: # @_Z8do_catchv .Ltmp3: .set .Lfunc_begin0, .Ltmp3 .cfi_startproc .cfi_personality 128, DW.ref.__gxx_personality_v0 .cfi_lsda 0, .Lexception0 .frame $c11,48,$c17 .mask 0x00000000,0 .fmask 0x00000000,0 .set noreorder .set nomacro .set noat # %bb.0: # %entry The `[^:]*` regex was terminating the search after .Ltmp<N>: and therefore not detecting functions with exception handling. Reviewed By: atanasyan, MaskRay Differential Revision: https://reviews.llvm.org/D100027	2021-09-28 17:57:36 +01:00
Alex Richardson	ee3109b044	[update_llc_test_checks] Baseline test for D100027 Show that we fail to generate CHECK lines for MIPS64 functions with EH. Differential Revision: https://reviews.llvm.org/D110408	2021-09-28 17:57:36 +01:00
Jozef Lawrynowicz	6cfb4d46ba	[llvm-readobj] Support dumping of MSP430 ELF attributes The MSP430 ABI supports build attributes for specifying the ISA, code model, data model and enum size in ELF object files. Differential Revision: https://reviews.llvm.org/D107969	2021-09-28 00:56:11 +03:00
modimo	ce6ed64a69	[llvm-profdata] Extend support of --topn to sample profiles Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110449	2021-09-24 16:42:46 -07:00
Wei Mi	80865f7579	Add "REQUIRES: zlib" in forward-compatible.test since it handles compressed file.	2021-09-24 15:35:07 -07:00
Wei Mi	e8b376547b	Fixed a bug in https://reviews.llvm.org/rG8eb617d719bdc6a4ed7773925d2421b9bbdd4b7a . For compressed profile when reading an unknown section, the data reader pointer adjustment was incorrect. This patch fixed that.	2021-09-24 15:23:45 -07:00
Jonas Devlieghere	d0649320bf	[dsymutil] Update union-fwd-decl.test for Windows Remove path separators from CHECK-lines in union-fwd-decl.test	2021-09-24 15:07:22 -07:00
David Blaikie	9911af4b91	WIP: Verify -gsimple-template-names=mangled values Clang will encode names that should be able to be simplified as "_STNname|<template, args>" (eg: "_STNt1|<int>") - this verification mode will detect these names, decode them, create the original name ("t1<int>") and the simple name ("t1") - letting the simple name run through the usual rebuilding logic - then compare the two sources of the full name - the rebuilt and the _STN encoding. This helps ensure that -gsimple-template-names is lossless.	2021-09-24 14:28:18 -07:00
Jonas Devlieghere	62d6ff5e9e	[dsymutil] Track incompleteness across unions When determining the incompleteness of a DIE based on its children, make sure we propagate it across union types. See test case for an example. Without this patch we never emit the definition of Container_ivars. Differential revision: https://reviews.llvm.org/D110443	2021-09-24 14:26:37 -07:00
wlei	1422fa5fab	[llvm-profgen] Unify output format of different unsymbolized profiles Differential Revision: https://reviews.llvm.org/D110080	2021-09-24 14:18:00 -07:00
wlei	28277e9b48	[AutoFDO][llvm-profgen] Report zero count for unexecuted part of function code In order to be consistent with the compiler, which interprets a zero count as unexecuted (cold), this change reports a zero-value count for the unexecuted part of function code. For the implementation, it leverages the range counter and initializes all the executed function ranges with the zero-value. After all ranges are merged and converted into disjoint ranges, the remaining zero counts will indicate the unexecuted (cold) part of the function. This change also extends the current `findDisjointRanges` method, which can now support adding zero-value ranges. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D109713	2021-09-24 14:15:05 -07:00
wlei	d5f2013004	[AutoFDO][llvm-profgen] Profile generation for LBR(non-CS) sample This patch introduces non-CS AutoFDO profile generation into LLVM. The profile is supposed to be consumed by the compiler using `-fprofile-sample-use=[profile]`. After range and branch counters are extracted from the LBR sample, we go through each address for symbolization, create FunctionSamples and populate its sub fields like TotalSamples, BodySamples and HeadSamples etc. For inlined code, as we need to map back to the original code, we always add body samples to the leaf frame's function sample. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D109551	2021-09-24 13:55:34 -07:00
wlei	a7cdcf25c1	[llvm-profgen] Ignore invalid perf line in LBR record Similar to https://reviews.llvm.org/D109637, there is a whole invalid line of message in perfscript. ``` warning: Invalid address in LBR record at line 14118674: Processed 14138923 events and lost 1 chunks! warning: Invalid address in LBR record at line 14118676: Check IO/CPU overload! ``` This only happened for LBR-only perfscript; hybrid perfscript has a check of " 0x" to make sure it's the LBR perf line. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110424	2021-09-24 13:44:57 -07:00
Simon Pilgrim	dade83c02a	[X86][SLM] Fix ADDQ/SUBQ/CMPEQQ throughput to account for running on either port. Testing on a SLM box suggests these can run on either port, but the throughput is 4cy on either (inc MMX versions). Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-24 10:06:14 +01:00
Wenlei He	81c249784f	[llvm-profgen] Use hot threshold for context merging and trimming Without preinliner, we need to tune down the cold count cutoff to merge/trim more context to limit profile size for large components. However it doesn't make sense for cold threshold to be higher than hot threshold, so we now change to use hot threshold as merging/trimming cut off instead. Differential Revision: https://reviews.llvm.org/D110212	2021-09-22 15:01:51 -07:00
Hongtao Yu	734f4d832c	[llvm-profgen] An option to dump disasm of specified symbols For large apps, dumping disasm of the whole program can be slow and result in giant output. Adding a switch to dump specific symbols only. Reviewed By: wlei Differential Revision: https://reviews.llvm.org/D110079	2021-09-22 10:32:59 -07:00
Hongtao Yu	d9b511d8e8	[CSSPGO] Set PseudoProbeInserter as a default pass. Currently PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e., through the linker or llc), where the user always has to pass -pseudo-probe-for-profiling explicitly. I'm making the pass a default pass that requires no command line arg to trigger, but will actually be run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110209	2021-09-22 09:09:48 -07:00
Sebastian Neubauer	ecd5145c27	[Utils] Replace llc with cat for tests Make the update_llc_test_checks script test independent of llc behavior by using cat with static files to simulate llc output. This allows changing llc without breaking the script test case. The update script is executed in a temporary directory, so the llc-generated assembly files are copied there. %T is deprecated, but it allows copying a file with a predictable filename. Differential Revision: https://reviews.llvm.org/D110143	2021-09-22 10:10:35 +02:00
David Blaikie	49c519a848	DebugInfo: Rebuild decltype(nullptr) as 'std::nullptr_t' Now that Clang's been changed to render nullptr types/template parameters as 'std::nullptr_t' do the same thing down here. (Clang commit: `131e878664` )	2021-09-21 11:37:30 -07:00
Paul Robinson	fa822a2ee5	[DebugInfo] Add test for dumping DW_AT_defaulted	2021-09-20 16:43:53 -04:00
Alex Richardson	817e23d481	[update_mir_test_checks.py] Use -NEXT FileCheck directives Previously the script emitted output using plain CHECK directives. This can result in a test passing even if there are some instructions between CHECK directives that should have been removed. It also makes debugging tests that have the output in a different order more difficult since FileCheck can match with a later line and then complain about the "wrong" directive not being found. This will cause quite large diffs when updating existing tests, but I'm not sure we need an opt-in flag here. Depends on D109765 (pre-commit tests) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D109767	2021-09-20 12:55:56 +01:00
Alex Richardson	7b68c0725d	pre-commit test for D109767 Differential Revision: https://reviews.llvm.org/D109765	2021-09-20 12:55:56 +01:00
David Blaikie	cb42bb3550	llvm-dwarfdump: pretty type printing: print fully qualified names in function type parameter types	2021-09-19 18:49:15 -07:00
David Blaikie	606ea0dd2a	llvm-dwarfdump: support for type printing "decltype(nullptr)" as "nullptr_t" This should probably be rendered as "std::nullptr_t" but for now clang uses the unqualified name (which is ambiguous with possible user defined name in the global namespace), so match that here.	2021-09-19 17:33:56 -07:00
David Blaikie	11e0b79b05	llvm-dwarfdump: Don't print even an empty string when a type is unprintable	2021-09-19 17:03:10 -07:00
David Blaikie	5bfe5207ef	llvm-dwarfdump: Pretty print names qualified/with scopes	2021-09-19 16:36:01 -07:00
David Blaikie	372e2c24b6	llvm-dwarfdump: Pretty printing types including a space between const and parenthesized references/pointers to arrays	2021-09-19 13:32:53 -07:00
David Blaikie	f09ca5c646	DWARFDie: Improve type printing for function and array types - with qualifiers (cv/reference) and pointers to them	2021-09-19 12:59:31 -07:00
Simon Pilgrim	f855ef2601	[X86][Atom] Fix FP uops + port usage Both ports are required in most cases. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well. Noticed while trying to improve fp costs for vectorization via the D103695 helper script.	2021-09-19 20:39:20 +01:00
David Blaikie	2ca637c976	llvm-dwarfdump: Refactor type pretty printing tests Move most type tests to a pre-generated assembly file to make it easier to add more weird cases without having to hand craft more DWARF. Move the novel array types that aren't reachable via clang-generated DWARF to a separate file for easy maintenance.	2021-09-19 09:30:38 -07:00
Simon Pilgrim	cf8fac7d07	[X86][Atom] Specific uops for all IMUL/IDIV instructions Based off a mixture of llvm-exegesis captures (PR36895) and Intel AoM / Agner / InstLatX64 reports.	2021-09-19 16:58:52 +01:00
Simon Pilgrim	e381d8b243	[X86][Atom] Fix (U)COMISS/SD uops, latency and throughput Both ports are required, for reg and mem variants - we can also use the WriteFComX class directly and remove the unnecessary InstRW overrides. Matches what Intel AoM / Agner / InstLatX64 report as well.	2021-09-19 12:44:44 +01:00
Samuel	f18c0739b3	[llvm-reduce] Add reduce operands pass Add reduction to set operands to default values Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D108903	2021-09-17 12:32:15 -07:00
Simon Pilgrim	5ebe95e256	[X86][Atom] Fix integer shuffles uops, latency and throughput The MMX pack/unpck shuffles don't need an override - they have the same behaviour as other shuffles (Port0 only). The SSE pslldq/psrldq shuffles don't need an override - they have the same behaviour as other shuffles (Port0 only). The SSE pshufb shuffles use 4uops (+1 load). Noticed the pslldq/psrldq issue while trying to improve reduction costs via the D103695 helper script, and fixed the others while reviewing. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-17 12:11:54 +01:00
Wenlei He	446e21623c	[llvm-profgen] Use context-sensitive byte size cost for preinliner decisions by default Turn on `use-context-cost-for-preinliner` to use context-sensitive byte size cost for preinliner decisions by default. This is a more accurate proxy of inline cost than profile size. We tested on our large workload that it delivers measurable CPU improvement. Differential Revision: https://reviews.llvm.org/D109893	2021-09-16 10:36:12 -07:00
serge-sans-paille	85f2ae57f7	Be more flexible on the storage type allowed for llvm::Any::TypeId::Id This is a follow-up to `2c42a73d6c`.	2021-09-16 11:01:53 +02:00
Arthur Eubanks	5d78e33ce5	[test] Move some llvm-extract tests into the proper directory	2021-09-15 15:42:04 -07:00
serge-sans-paille	2c42a73d6c	Add extra check for llvm::Any::TypeId visibility This check should ensure we don't reproduce the problem fixed by `02df443d28`. More precisely, it checks every llvm::Any::TypeId symbol in libLLVM-x.so and makes sure they have weak linkage and are not local to the library, which would lead to duplicate definitions if another weak version of the symbol is defined in another linked library. Differential Revision: https://reviews.llvm.org/D109252	2021-09-15 08:32:55 +02:00
Esme-Yi	945df8bc4c	[obj2yaml][XCOFF] Dump sections Summary: This patch implements parsing sections for obj2yaml on AIX. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D98003	2021-09-15 05:16:33 +00:00
Hongtao Yu	0057c7185d	[CSSPGO][llvm-profgen] Truncate stack samples with invalid return address. Invalid frame addresses exist in call stack samples due to bad unwinding. This could happen to frame-pointer-based unwinding and the callee functions that do not have the frame pointer chain set up. It isn't common when the program is built with the frame pointer omission disabled, but can still happen with third-party static libs built with frame pointer omitted. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D109638	2021-09-14 21:56:22 -07:00
Martin Storsjö	63784b9a75	[llvm-readobj] [COFF] Resolve relocations pointing at section symbols for arm64 too This syncs parts from the x86 implementation to the ARMWinEH implementation. Currently, neither of the compilers targeting COFF/arm64 (MSVC, LLVM) produce such relocations, but LLVM might after a later patch. Differential Revision: https://reviews.llvm.org/D109650	2021-09-14 11:04:46 +03:00
Martin Storsjö	197084fcee	[llvm-readobj] [COFF] Try to resolve symbols in unwind info on x86 This is the same as we do on arm64 already for the MSVC style label symbols, but also handle the way GCC produces it - with all relocations pointing at the .text section symbol, with various offsets. Differential Revision: https://reviews.llvm.org/D109649	2021-09-14 11:04:46 +03:00
Esme-Yi	b98c3e957f	[yaml2obj][XCOFF] add the SectionIndex field for symbol. Summary: Add the SectionIndex field for symbol. 1: a symbol can reference a section by SectionName or SectionIndex. 2: a symbol can reference a section by both SectionName and SectionIndex. 3: if both Section and SectionIndex are specified, but the two values refer to different sections, an error will be reported. 4: an invalid SectionIndex is allowed. 5: if a symbol references a non-existent section by SectionName, an error will be reported. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D109566	2021-09-14 06:18:03 +00:00
Esme-Yi	909f3d7380	[yaml2obj][XCOFF] customize the string table Summary: The patch adds support for yaml2obj customizing the string table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D107421	2021-09-13 09:24:38 +00:00
Simon Pilgrim	65ad09da0e	[X86][SLM] Fix DIVPD/DIVPS/RCPPS/RSQRTPS/SQRTPD/SQRTPS/DPPD/DPPS uops, latency and throughput The packed variants of the instructions had been modelled as the same as the scalar variants. Reported during a run of llvm-exegesis on a cheap SLM box and matches what Agner / InstLatX64 report as well.	2021-09-13 08:36:43 +01:00
Simon Pilgrim	df975e4590	[X86][SLM] Fix PSAD/MPSAD uops, latency and throughput Noticed while trying to improve generic reduction costs via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-11 11:44:09 +01:00
Simon Pilgrim	484944ac3b	[X86][SLM] Fix HADD/HSUB uops, latency and throughput Noticed while trying to improve generic reduction costs via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-11 11:44:09 +01:00
Keith Smiley	e972e49b11	[llvm-cov] Add error for invalid -path-equivalence format Differential Revision: https://reviews.llvm.org/D109042	2021-09-10 18:34:37 -07:00
Sam Clegg	e4b2f3054a	[WebAssembly][libObject] Avoid re-use of Section object during parsing The re-use of this struct across iterations of the loop was causing fields (specifically Name) to be incorrectly shared between multiple sections. Differential Revision: https://reviews.llvm.org/D108984	2021-09-10 09:30:50 -04:00
Serge Bazanski	231bfaab31	[Lanai] fix MC / objdump D78776 removed is{Call,Branch,UnconditionalBranch} guards in objdump before calling MCInstrAnalysis::evaluateBranch. This is fine for other architectures as they gracefully handle evaluateBranch being called on non-branches. However, the Lanai MCInstrAnalysis implementation didn't and that change caused it to crash. This inserts the same guards back into Lanai's evaluateBranch implementation and adds a smoke test that exercises `llc | objdump` so this kind of regression is hopefully caught next time. Reviewed By: jpienaar, MaskRay Differential Revision: https://reviews.llvm.org/D107593	2021-09-10 10:46:13 +00:00
Alfonso Sánchez-Beato	b25ab4f313	[llvm-objcopy][COFF] Fix test for debug dir presence If the number of directories was 6 (equal to the DEBUG_DIRECTORY index), patchDebugDirectory() was run even though the debug directory is actually the 7th entry. Use <= in the comparison to fix that. This fixes https://llvm.org/PR51243 Differential Revision: https://reviews.llvm.org/D106940 Reviewed by: jhenderson	2021-09-10 09:57:18 +01:00
Alfonso Sánchez-Beato	b33fd31772	[yaml2obj][COFF] Allow variable number of directories Allow variable number of directories, as allowed by the specification. NumberOfRvaAndSize will default to 16 if not specified, as in the past. Reviewed by: jhenderson Differential Revision: https://reviews.llvm.org/D108825	2021-09-09 11:16:56 +01:00
Wei Mi	8eb617d719	[SampleFDO] Allow forward compatibility when adding a new section for extbinary format. Currently when we add a new section in the profile format and generate a profile containing the new section, older compiler which reads the new profile will issue an error. The forward incompatibility can cause unnecessary churn when extending the profile. This patch removes the incompatibility when adding a new section for extbinary format. Differential Revision: https://reviews.llvm.org/D109398	2021-09-07 19:38:43 -07:00
Maksim Panchenko	6300e4ac58	[llvm-objdump] Fix 'llvm-objdump -dr' for executables with relocations Print relocations interleaved with disassembled instructions for executables with relocatable sections, e.g. those built with "-Wl,-q". Differential Revision: https://reviews.llvm.org/D109016	2021-09-07 11:24:24 -07:00
Roman Lebedev	e030f808ec	[Exegesis] Native clusterization: sub-partition by sched class id Currently native clusterization simply groups all benchmarks by the opcode of key instruction, but that is suboptimal in certain cases, e.g. where we can already tell that the particular instructions already resolve into different sched classes.	2021-09-07 17:54:37 +03:00
Roman Lebedev	b3b9b297a0	[NFC][exegesis] Add test for the following patch	2021-09-07 17:54:36 +03:00
Simon Pilgrim	056b409ceb	[llvm-exegesis][x86] Limit llvm-exegesis analysis tests to x86_64 triple hosts Attempting to fix an issue with test failures on arm m1 apple macintoshes reported on D109353	2021-09-07 14:35:52 +01:00
Simon Pilgrim	6a9e2764f6	[llvm-exegesis] Analysis tests should run even without libpfm (PR51687) Move inverse_throughput, latency and uops to sub-directories (like we already do for lbr), which require libpfm, so we can relax the lit limits for analysis tests in the x86 root directory. Differential Revision: https://reviews.llvm.org/D109353	2021-09-07 13:58:05 +01:00
Andrew Litteken	bd4b1b5f6d	[IRSim] Adding support for recognizing branch similarity The current IRSimilarityIdentifier does not try to find similarity across blocks. This patch provides a mechanism to compare two branches against one another, to find similarity across basic blocks, rather than just within them. This adds a step in the similarity identification process that labels all of the basic blocks so that we can identify the relative branching locations. Within an IRSimilarityCandidate we use these relative locations to determine whether the branching to other relative locations in the same region is the same between branches. If they are, we consider them similar. We do not consider the relative location of the branch if the target branch is outside of the region. In this case, both branches must exit to a location outside the region, but the exact relative location does not matter. Reviewers: paquette, yroux Differential Revision: https://reviews.llvm.org/D106989	2021-09-06 11:55:38 -07:00
Simon Pilgrim	2005ae15a6	[X86][SLM] WriteVecIMul instructions only take 1uop (REAPPLIED) The xmm variants have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop. I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD. But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.	2021-09-04 15:03:56 +01:00
Simon Pilgrim	ac51d69208	Revert rG994da657076900f5ad7fe593c3b5e5f89ab3d53d "[X86][SLM] WriteVecIMul instructions only take 1uop" This changed some codegen tests that I forgot about in my rebase, I'll recommit shortly with a fix.	2021-09-04 13:39:10 +01:00
Simon Pilgrim	994da65707	[X86][SLM] WriteVecIMul instructions only take 1uop The xmm variants have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop. I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD. But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.	2021-09-04 13:21:34 +01:00
Simon Pilgrim	c6371020a8	[X86][SLM] RMW instructions don't require an extra uop For RMW instructions, the load and store hold the MEC for an extra cycle, but within the same single uop. This is alluded to in the Intel AOM: "The MEC also owns the MEC RSV, which is responsible for scheduling of all loads and stores. Load and store instructions go through addresses generation phase in program order to avoid on-the-fly memory ordering later in the pipeline. Therefore, an unknown address will stall younger memory instructions." Noticed while trying to get a cheap SLM test box up and running with llvm-exegesis - RMW arithmetic is always 1uop - and matches what Agner / InstLatX64 report as well.	2021-09-04 13:21:34 +01:00
Simon Pilgrim	da965a77d5	[X86][SLM] Fix MUL uops, latency and throughput These were all set to the same best case mul i32 values (which seems to be the only version of MUL that SLM actually performs well with). Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-04 13:21:34 +01:00
Simon Pilgrim	7d062d2c47	[X86][Atom] MUL/DIV instructions require both ports, not either. Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM.	2021-09-04 11:58:09 +01:00
Richard Smith	02fe58d628	DebugInfo: additional fix missed in `bc066e2`.	2021-09-03 15:28:00 -07:00
David Blaikie	bc066e26c9	DebugInfo: Fix a few bot failures for type dumping fixes	2021-09-03 14:08:58 -07:00
David Blaikie	40f1593558	DebugInfo: Correct/improve type formatting (pointers to function types especially) This does add some extra superfluous whitespace (eg: "int *") intended to make the Simplified Template Names work easier - this makes the DIE-based names match more exactly the clang-generated names, so it's easier to identify cases that don't generate matching names. (arguably we could change clang to skip that whitespace or add some fuzzy matching to accommodate differences in certain whitespace - but this seemed easier and fairly low-impact)	2021-09-03 12:22:28 -07:00
Jinsong Ji	343a72a24d	[NFC][CSSPGO] Add end of file newline to test input On some platforms (eg: AIX), diff will complain about the missing newline. diff: Missing newline at the end of file .../llvm/test/tools/llvm-profdata/Inputs/cs-sample.proftext.	2021-09-03 17:42:32 +00:00
Simon Pilgrim	6ba0b9f68a	[X86][SLM] Fix PBLENDVB uops and throughput SLM PBLENDVB is just as bad as BLENDVPD/PS - so model it as such, fixing the rr vs rm uops diff as well. The Intel AoM appears to have a copy+paste typo with PBLENDW, it doesn't match Agner or InstLatX64. Noticed while investigating some of the weird discrepancies reported by the D103695 helper script (SLM had much better vector shift throughputs than it should).	2021-09-03 11:31:29 +01:00
gbreynoo	e28cd75a50	[OptTable] Reapply Improve error message output for grouped short options This reapplies `71d7fed3bc` which was reverted by `3e2bd82f02`. This change includes the fix for breaking the sanitizer bots. As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current implementation for parsing grouped short options can return unclear error messages. This change fixes the example given in the ticket in which a flag is incorrectly given an argument. Also when parsing a group we now keep reading past the first incorrect option and output errors for all incorrect options in the group. Differential Revision: https://reviews.llvm.org/D108770	2021-09-03 11:13:52 +01:00
Hongtao Yu	7ca8030030	[CSSPGO] Enable loading MD5 CS profile. Adding the compiler support of MD5 CS profile based on previous context split work D107299. An MD5 CS profile is about 40% smaller than the string-based extbinary profile. As a result, the compilation is 15% faster. There are a few conversions from real names to md5 names that have been made on the sample loader and context tracker side to get it to work. Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D108342	2021-09-01 09:19:47 -07:00
Kevin Athey	3e2bd82f02	Revert "[OptTable] Improve error message output for grouped short options" This reverts commit `71d7fed3bc`. Reason: broke sanitizer bots more info: https://reviews.llvm.org/D108770	2021-08-31 14:06:11 -07:00
wlei	964053d56f	[llvm-profgen] Support LBR only perf script This change aims at supporting LBR only sample perf script which is used for regular(Non-CS) profile generation. A LBR perf script includes a batch of LBR sample which starts with a frame pointer and a group of 32 LBR entries is followed. The FROM/TO LBR pair and the range between two consecutive entries (the former entry's TO and the latter entry's FROM) will be used to infer function profile info. An example of LBR perf script(created by `perf script -F ip,brstack -i perf.data`) ``` 40062f 0x40062f/0x4005b0/P/-/-/9 0x400645/0x4005ff/P/-/-/1 0x400637/0x400645/P/-/-/1 ... 4005d7 0x4005d7/0x4005e5/P/-/-/8 0x40062f/0x4005b0/P/-/-/6 0x400645/0x4005ff/P/-/-/1 ... ... ``` For implementation: - Extended a new child class `LBRPerfReader` for the sample parsing, reused all the functionalities in `extractLBRStack` except for an extension to parsing leading instruction pointer. - `HybridSample` is reused(just leave the call stack empty) and the parsed samples is still aggregated in `AggregatedSamples`. After that, range samples, branch sample, address samples are computed and recorded. - Reused `ContextSampleCounterMap` to store the raw profile, since it's no need to aggregation by context, here it just registered one sample counter with a fake context key. - Unified to use `show-raw-profile` instead of `show-unwinder-output` to dump the intermediate raw profile, see the comments of the format of the raw profile. For CS profile, it remains to output the unwinder output. Profile generation part will come soon. Differential Revision: https://reviews.llvm.org/D108153	2021-08-31 13:28:17 -07:00
gbreynoo	71d7fed3bc	[OptTable] Improve error message output for grouped short options As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current implementation for parsing grouped short options can return unclear error messages. This change fixes the example given in the ticket in which a flag is incorrectly given an argument. Also when parsing a group we now keep reading past the first incorrect option and output errors for all incorrect options in the group. Differential Revision: https://reviews.llvm.org/D108770	2021-08-31 16:41:08 +01:00
Simon Pilgrim	7ec7272b80	[MCA][X86] Add basic coverage for icelake arch Copy the skylake-avx512 tests for icelake-server coverage. Add icelake/rocketlake/tigerlake test coverage to the relevant generic tests as well.	2021-08-31 12:20:09 +01:00
Hongtao Yu	b9db70369b	[CSSPGO] Split context string to deduplicate function name used in the context. Currently context strings contain a lot of duplicated function names and that significantly increase the profile size. This change split the context into a series of {name, offset, discriminator} tuples so function names used in the context can be replaced by the index into the name table and that significantly reduce the size consumed by context. A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings which is time- and memory- consuming. Instead a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previous prevalent profile map which was implemented as a `StringRef` is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to using an `ArrayRef` to represent a full context for CS profile. For non-CS profile, it falls back to use `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are underpinned by real array and string objects that are stored in producer buffers. For compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings can be generated only in those cases of debugging and printing. When it comes to profile format, nothing has changed to the text format, though internally CS context is implemented as a vector. Extbinary format is only changed for CS profile, with an additional `SecCSNameTable` section which stores all full contexts logically in the form of `vector<int>`, which each element as an offset points to `SecNameTable`. All occurrences of contexts elsewhere are redirected to using the offset of `SecCSNameTable`. Testing This is no-diff change in terms of code quality and profile content (for text profile). For our internal large service (aka ads), the profile generation is cut to half, with a 20x smaller string-based extbinary format generated. The compile time of ads is dropped by 25%. Differential Revision: https://reviews.llvm.org/D107299	2021-08-30 20:09:29 -07:00
Keith Smiley	b5da3120b8	[llvm-cov][NFC] Add test for coverage-prefix-map remappings This test acts as a regression test for these fixes: `c75a0a1e9d` `dd388ba3e0` Differential Revision: https://reviews.llvm.org/D108805	2021-08-30 17:19:57 -07:00
Haowei Wu	31e61c58b0	[ifs] Add option to hide undefined symbols This change adds an option to llvm-ifs to hide undefined symbols from its output. Differential Revision: https://reviews.llvm.org/D108428	2021-08-27 11:15:56 -07:00
Roman Lebedev	d4d459e747	[X86] AMD Zen 3: MULX w/ mem operand has the same throughput as with reg op Exegesis is faulty and sometimes when measuring throughput^-1 produces snippets that have loop-carried dependencies, which must be what caused me to incorrectly measure it originally. After looking much more carefully, the inverse throughput should match that of the MULX w/ reg op. As per llvm-exegesis measurements.	2021-08-27 13:27:05 +03:00
Roman Lebedev	0f04936a2d	[X86] AMD Zen 3: MULX produces low part of the result in 3cy, +1cy for high part As per llvm-exegesis measurements.	2021-08-27 13:27:05 +03:00
Roman Lebedev	db2c6cd99c	[NFC][X86][MCA] AMD Zen 3: improve MULX test coverage Latency for MULX isn't right	2021-08-27 13:27:05 +03:00
Andrea Di Biagio	4a5b191703	[X86][MCA] Address the latest issues with MULX reported in PR51495. It turns out that SchedWrite WriteIMulH was always assigned to the low half of the result of a MULX (rather than to the high half). To avoid confusion, this patch swaps the two MULX writes in the tablegen definition of MULX32/64. That way, write names better describe what they actually refer to; this also avoids further complications if in future we decide to reuse the same MulH writes to also model other scalar integer multiply instructions. I also had to swap the latency values for the two MULX writes to make sure that the change is effectively an NFC. In fact, none of the existing x86 tests were affected by this small refactoring. This patch also fixes a bug in MCA: a wrong latency value was propagated for instructions that perform multiple writes to a same register. This last issue was found by Roman while testing MULX on targets that define a different latency for the Low/High part of the result. Differential Revision: https://reviews.llvm.org/D108727	2021-08-26 12:08:20 +01:00
David Green	6ffc6951a3	[AArch64] Remove unpredictable from narrowing instructions. Like other similar instructions the xtn2 family do not have side effects, and explicitly marking them as such can help improve scheduling freedom.	2021-08-26 09:43:44 +01:00
David Green	9474b03d41	[AArch64] Add a Cortex-A55 NEON scheduler test case.	2021-08-26 09:43:44 +01:00
Esme-Yi	b21ed75e10	[llvm-readobj][XCOFF] Add support for `--needed-libs` option. Summary: This patch adds support for the llvm-readobj --needed-libs option under XCOFF. For XCOFF, the needed libraries can be found from the Import File ID Name Table of the Loader Section. Currently, I am using binary inputs in the test since yaml2obj does not yet support writing the Loader Section and the import file table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D106643	2021-08-26 07:17:06 +00:00
Fangrui Song	4a66a11286	[LLVMgold.so][test] Make comdat-nodeduplicate.ll work with binutils<2.27	2021-08-25 16:59:06 -07:00
Andrea Di Biagio	6181427bb9	[X86][MCA] Add more tests for MULX (PR51495). llvm-mca still reports a wrong latency for the case where the two destination registers of MULX are the same.	2021-08-25 21:28:21 +01:00
Alfonso Sánchez-Beato	cdd407286a	[llvm-objcopy] [COFF] Consider section flags when adding section The --set-section-flags option was being ignored when adding a new section. Take it into account if present. Fixes https://llvm.org/PR51244 Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D106942	2021-08-25 23:11:41 +03:00
Rong Xu	24201b6437	[SampleFDO] Set ProfileIsFS bit properly from the internal option We have "-profile-isfs" internal option for text, binary, and compactbinary format (mostly for debug and test purpose). We need to set the related flag in FunctionSamples so that ProfileIsFS is written to the header in extbinary format. Differential Revision: https://reviews.llvm.org/D108707	2021-08-25 09:07:34 -07:00
Wenlei He	a6f15e9a49	[CSSPGO] Use probe inline tree to track zero size fully optimized context for pre-inliner This is a follow up diff for BinarySizeContextTracker to track zero size for fully optimized inlinee. When an inlinee is fully optimized away, we won't be able to get its size through symbolizing instructions, hence we will treat the corresponding context size as unknown. However by traversing the inlined probe forest, we know what're original inlinees regardless of optimization. If a context show up in inlined probes, but not during symbolization, we know that it's fully optimized away hence its size is zero instead of unknown. It should provide more accurate size cost estimation for pre-inliner to make better inline decisions in llvm-profgen. Differential Revision: https://reviews.llvm.org/D108350	2021-08-25 09:01:11 -07:00
Andrea Di Biagio	5f848b311f	[X86][SchedModel] Fix latency the Hi register write of MULX (PR51495). Before this patch, WriteIMulH reported a latency value which is correct for the RR variant of MULX, but not for the RM variant. This patch fixes the issue by introducing a new WriteIMulHLd, which is meant to be used only by the RM variant of MULX. Differential Revision: https://reviews.llvm.org/D108701	2021-08-25 16:12:09 +01:00
Vyacheslav Zakharin	2e192ab1f4	[CodeExtractor] Preserve topological order for the return blocks. Differential Revision: https://reviews.llvm.org/D108673	2021-08-25 08:09:01 -07:00
Andrea Di Biagio	fe13b81ed9	[X86][NFC] Pre-commit llvm-mca tests for PR51495. WriteIMulH reports an incorrect latency for RM variants of MULX.	2021-08-25 14:17:17 +01:00
Fangrui Song	9b96b0865d	llvm-xray {convert,extract}: Add --demangle No demangling may be a better default in the future. Add `--demangle` for migration convenience. Reviewed By: Enna1 Differential Revision: https://reviews.llvm.org/D108100	2021-08-24 13:35:19 -07:00
Patrick Holland	e4ebfb5786	[MCA] Adding an AMDGPUCustomBehaviour implementation. This implementation allows mca to model the desired behaviour of the s_waitcnt instruction. This patch also adds the RetireOOO flag to the AMDGPU instructions within the scheduling model. This flag is only used by mca and allows instructions to finish out-of-order which helps mca's simulations more closely model the actual device. Differential Revision: https://reviews.llvm.org/D104730	2021-08-24 13:33:58 -07:00
Arthur Eubanks	d2e103644b	[llvm-reduce] Remove various module data This removes the data layout, target triple, source filename, and module identifier when possible. Reviewed By: swamulism Differential Revision: https://reviews.llvm.org/D108568	2021-08-24 09:45:31 -07:00
David Green	50f4ae58eb	[AArch64] Correct store ReadAdrBase operand It appears that the Read operand for stores was being placed on the first operand (the stored value) not the address base. This adds a ReadST for the stored value operand, allowing the ReadAdrBase to correctly act upon the address. Differential Revision: https://reviews.llvm.org/D108287	2021-08-23 21:07:55 +01:00
David Green	955c9437fd	[AArch64] Add Scheduling tests for Load/Store ReadAdv operands.	2021-08-23 21:07:55 +01:00
Alexey Lapshin	07d44cc0b1	[DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent. The verifyDieRanges function checks for intersecting address ranges. It adds child DieRangeInfo into parent DieRangeInfo to check whether children have overlapping address ranges. It is safe to not add DieRangeInfo with empty address range into parent's children list. This decreases the number of children which should be navigated and as a result decreases execution time (parents having a lot of children with empty ranges spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl" execution time decreased from 220 sec to 75 sec. Differential Revision: https://reviews.llvm.org/D107554	2021-08-22 19:39:21 +03:00
Christian Fetzer	9116211d18	[Coverage][llvm-cov] Correctly export branch coverage in LCOV format Commit `9f2967bcfe` introduced support for branch coverage including export to the LCOV format. This commit corrects the LCOV field name for branches from BFH to BRH. The mistake seems to have slipped in as typo because the correct field name BRH is used in the comment section at the beginning of the file. Differential Revision: https://reviews.llvm.org/D108358	2021-08-20 13:44:25 -05:00
Andrea Di Biagio	35d4292a73	[X86][SchedModels] Fix missing ReadAdvance for MULX and ADCX/ADOX (PR51494) Before this patch, instructions MULX32rm and MULX64rm were missing a ReadAdvance for the implicit read of register EDX/RDX. This patch fixes the issue, and it also introduces a new SchedWrite for the two variants of MULX. The general idea behind this last change is to eventually decrease the number of InstRW in the scheduling models. This patch also adds a ReadAdvance for the implicit read of EFLAGS in ADCX/ADOX. Differential Revision: https://reviews.llvm.org/D108372	2021-08-20 17:39:51 +01:00
Maryam Benimmar	2cdfd0b259	[AIX][XCOFF] 64-bit relocation reading support Support XCOFFDumper relocation reading support This patch is part of D103696 partition Reviewed By: daltenty, Helflym Differential Revision: https://reviews.llvm.org/D104646	2021-08-19 21:56:57 -04:00
Andrzej Warzynski	dcc6b7b1d5	[OptTable] Refine how `printHelp` treats empty help texts Currently, `printHelp` behaves differently for options that: * do not define `HelpText` (such options _are not printed_), and * define their `HelpText` as `HelpText<"">` (such options _are printed_). In practice, both approaches lead to no help text and `printHelp` should treat them consistently. This patch addresses that by making `printHelp` check the length of the help text to be printed. All affected tests have been updated accordingly. The option definitions for llvm-cvtres have been updated with a short description or "Not implemented" for options that are ignored by the tool. Differential Revision: https://reviews.llvm.org/D107557	2021-08-19 09:30:15 +00:00
Wenlei He	eca03d2768	[CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions. To do this, BinarySizeContextTracker is introduced to track function byte size under different inline context during disassembling. In preinliner, we can now query context byte size under switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep size of functions under different context (callee as parent, caller as child), and it can give best/longest possible matching context size for given input context. The new size cost is off by default. There are a few TODOs that need to be addressed: 1) avoid dangling string from `Offset2LocStackMap`, which will be addressed in split context work; 2) using inlinee's entry probe to make sure we have correct zero size for inlinee that's completely optimized away after inlining. Some tuning is also needed. Differential Revision: https://reviews.llvm.org/D108180	2021-08-18 22:50:57 -07:00
Andrea Di Biagio	2d53e54f0e	[X86][NFC] Pre-commit tests for PR51494	2021-08-18 19:55:21 +01:00
Maryam Benimmar	7151a8aada	[PowerPC][AIX] llvm-readobj: Convert some errors to warnings. Report warnings rather than errors, so that llvm-readobj doesn't bail out on malformed inputs. Differential Revision: https://reviews.llvm.org/D106783	2021-08-18 11:04:08 -04:00
Xu Mingjie	168ee72718	[NFC][llvm-xray] add an llvm-xray convert option `no-demangle` When option `--symbolize` is true, llvm-xray convert will demangle function names by default. This patch adds an llvm-xray convert option `no-demangle` to determine whether to demangle function names when symbolizing function ids from the input log. Reviewed By: MaskRay, smeenai Differential Revision: https://reviews.llvm.org/D108019	2021-08-18 12:22:04 +08:00
Jozef Lawrynowicz	108ba4f4a4	[llvm-readobj] Refactor ELFDumper::printAttributes() The current implementation of printAttributes makes it fiddly to extend attribute support for new targets. By refactoring the code so all target specific variables are initialized in a switch/case statement, it becomes simpler to extend attribute support for new targets. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D107968	2021-08-17 13:28:31 -07:00
Tozer	6d5e31baaa	Fix 2: [MCParser] Correctly handle CRLF line ends when consuming line comments Fixes an issue with revision `5c6f748c` and `ad40cb88`. Adds an mcpu argument to the test command, preventing an invalid default CPU from being used on some platforms.	2021-08-17 17:13:21 +01:00
Fangrui Song	c56b4cfd4b	[llvm-objdump] -T: print symbol versions Similar to D94907 (llvm-nm -D). The output will match GNU objdump 2.37. Older versions don't use ` (version)` for undefined symbols. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D108097	2021-08-17 09:10:50 -07:00
Tozer	ad40cb8821	Fix: [MCParser] Correctly handle CRLF line ends when consuming line comments Fixes an issue with revision `5c6f748c`. Move the test added in the above commit into the X86 folder, ensuring that it is only run on targets where its triple is valid.	2021-08-17 16:16:19 +01:00
Tozer	5c6f748cbc	[MCParser] Correctly handle CRLF line ends when consuming line comments Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=47983 The AsmLexer currently has an issue with lexing line comments in files with CRLF line endings, in which it reads the carriage return as being part of the line comment. This causes an error for certain valid comment layouts; this patch fixes this by excluding the carriage return from the line comment. Differential Revision: https://reviews.llvm.org/D90234	2021-08-17 15:52:51 +01:00
Fangrui Song	54e76cb17a	[split-file] Default to --no-leading-lines It turns out that the --leading-lines may be a bad default. [[#@LINE+-num]] is rarely used.	2021-08-16 19:23:11 -07:00
Hongtao Yu	f27fee623d	[SamplePGO][NFC] Dump function profiles in order Sample profiles are stored in a string map which is basically an unordered map. Printing out profiles by simply walking the string map doesn't enforce an order. I'm sorting the map in the decreasing order of total samples to enable a more stable dump, which is good for comparing two dumps. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D108147	2021-08-16 17:22:30 -07:00
Fangrui Song	935a6d4024	[test] Change llvm-xray options to use the preferred double-dash forms and change -f= to -f	2021-08-15 21:19:04 -07:00
David Blaikie	44d0a99a12	Add missing triple for test	2021-08-15 12:32:12 -07:00
David Blaikie	62a4c2c10e	DWARFVerifier: Check section-relative references at the end of the section This ensures that debug_types references aren't looked for in debug_info section. Behavior is still going to be questionable in an unlinked object file - since cross-cu references could refer to symbols in another .debug_info (or, in theory, .debug_types) chunk - but if a producer only uses ref_addr to refer to things within the same .debug_info chunk in an object file (eg: whole program optimization/LTO - producing two CUs into a single .debug_info section in an object file - the ref_addrs there could be resolved relative to that .debug_info chunk, not needing to consider comdat (DWARFv5 type units or other creatures) chunks of .debug_info, etc)	2021-08-15 11:40:24 -07:00
David Blaikie	2af4db7d5c	Migrate DWARFVerifier tests to lit-based yaml instead of gtest with embedded yaml Improves maintainability (edit/modify the tests without recompiling) and error messages (previously the failure would be a gtest failure mentioning nothing of the input or desired text) and the option to improve tests with more checks. (maybe these tests shouldn't all be in separate files - we could probably have DWARF yaml that contains multiple errors while still being fairly maintainable - the various invalid offsets (ref_addr, rnglists, ranges, etc) could probably be all in one test, but for the simple sake of the migration I just did the mechanical thing here)	2021-08-13 19:09:41 -07:00
Vyacheslav Zakharin	15497e62f6	[openmp][ELF] Recognize LLVM OpenMP offload specific notes The new ELF notes are added in clang-offload-wrapper, and llvm-readobj has to visualize them properly. Differential Revision: https://reviews.llvm.org/D99552	2021-08-12 13:47:48 -07:00
Igor Kudrin	68616584c3	[llvm-objcopy][ELF] Avoid reordering section headers As for now, llvm-objcopy sorts section headers according to the offsets of the sections in the input file. That can corrupt section references in the dynamic symbol table because it is a loadable section and as such is not updated by the tool. Even though the section references are not required for loading the binary correctly, they are still handy for a user who analyzes the file. While the patch removes global reordering of section headers, it lays out the sections in the same way as before, i.e. according to their original offsets. All that helps the output file to resemble the input better. Note that the patch removes sorting SHT_GROUP sections to the start of the list, which was introduced in D62620 in order to ensure that they come before the group members, along with the corresponding test. The original issue was caused by the sorting of section headers, so dropping the sorting also resolves the issue. Differential Revision: https://reviews.llvm.org/D107653	2021-08-12 17:12:09 +07:00
wlei	856a6a5041	[CSSPGO][llvm-profgen] Trim and merge context beforehand to reduce memory usage Currently we use a centralized string map(StringMap<FunctionSamples> ProfileMap) to store the profile while populating the sample, which might cause the memory usage bottleneck. I saw in an extreme case, there are thousands of samples whose context stack depth is >= 100. The memory consumption can be greater than 100GB. As here the context is used for inlining, we can assume we won't have so many of inlinees keeping inlined at the same root function, so this change tried to cap the context stack and merge the samples for peak memory reduction and this is done after recursion compression. The default value is -1 meaning no depth limit, in the future we can tune to a smaller one. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D107800	2021-08-11 16:02:35 -07:00
Fangrui Song	76093b1739	[InlineAdvisor] Add single quotes around caller/callee names Clang diagnostics refer to identifier names in quotes. This patch makes inline remarks conform to the convention. New behavior: ``` % clang -O2 -Rpass=inline -Rpass-missed=inline -S a.c a.c:4:25: remark: 'foo' inlined into 'bar' with (cost=-30, threshold=337) at callsite bar:0:25; [-Rpass=inline] int bar(int a) { return foo(a); } ^ ``` Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D107791	2021-08-10 11:51:31 -07:00
Ben Dunbobbin	9e4d2b193a	[llvm-ar] Add some test-cases for empty archives We had coverage of empty archive in our downstream testsuite. This adds those cases upstream. Differential Revision: https://reviews.llvm.org/D107471	2021-08-10 10:34:50 +01:00
Esme-Yi	f49c3a6882	[llvm-readobj][XCOFF] Print the length of the string table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D107333	2021-08-09 06:47:15 +00:00
Pirama Arumuga Nainar	16ebb7ab5c	[llvm-objcopy] [COFF] Do not patch debug entries if PointerToRawData is zero Fix an edge case missed by https://reviews.llvm.org/D78921. For e.g., the Repro debug entry (generated with the /Brepro linker flag) does not have a debug-directory payload. Do not attempt to patch Debug entries without a payload. Differential Revision: https://reviews.llvm.org/D107324	2021-08-06 09:23:25 -07:00
Martin Storsjö	46020f6f0c	[llvm-rc] Allow specifying language with a leading 0x prefix This option is always interpreted strictly as a hexadecimal string, even if it has no prefix that indicates the number format, hence the existing call to StringRef::getAsInteger(16, ...). StringRef::getAsInteger(0, ...) consumes a leading "0x" prefix if present, but when the radix is specified, the radix shouldn't be included. Both MS rc.exe and GNU windres accept the language with that prefix. Also allow specifying the codepage to llvm-windres with a different radix, as GNU windres allows that (but MS rc.exe doesn't). This fixes https://llvm.org/PR51295. Differential Revision: https://reviews.llvm.org/D107263	2021-08-05 10:19:55 +03:00
Igor Kudrin	2c14798ead	[ARM][llvm-objdump] Annotate PC-relative memory operands of VLDR instructions This extends D105979 and adds support for VLDR instructions. Differential Revision: https://reviews.llvm.org/D105980	2021-08-05 14:11:11 +07:00
Igor Kudrin	ddbe812bcc	[ARM][llvm-objdump] Annotate PC-relative memory operands This implements `MCInstrAnalysis::evaluateMemoryOperandAddress()` for Arm so that the disassembler can print the target address of memory operands that use PC+immediate addressing. Differential Revision: https://reviews.llvm.org/D105979	2021-08-05 14:11:11 +07:00
Andrea Di Biagio	7a1a35a1d1	[X86][SchedModel] Add missing ReadAdvance for some arithmetic ops (PR51318 and PR51322). This fixes a bug where implicit uses of EFLAGS were not marked as ReadAdvance in the RM/MR variants of ADC/SBB (PR51318) This also fixes the absence of ReadAdvance for the register operand of RMW arithmetic instructions (PR51322). Differential Revision: https://reviews.llvm.org/D107367	2021-08-04 17:50:22 +01:00
Esme-Yi	737e27f623	[llvm-readobj][XCOFF] dump the string table only if the size is bigger than 4.	2021-08-04 06:28:26 +00:00
Vitaly Buka	3df1e7e6f0	[llvm-readobj][XCOFF] Warn about invalid offset Followup for D105522 Differential Revision: https://reviews.llvm.org/D107398	2021-08-03 20:11:26 -07:00
wlei	f1affe8dc8	[llvm-profgen][CSSPGO] Support count based aggregated type of hybrid perf script This change tried to integrate a new count based aggregated type of perf script. The only difference of the format is that an aggregated count is added at the head of the original sample which means the same samples are repeated to the given count times. This is used to reduce the perf script size. e.g. ``` 2 4005dc 400634 400684 7f68c5788793 0x4005c8/0x4005dc/P/-/-/0 .... ``` Implemented by a dedicated PerfReader `AggregatedHybridPerfReader`. Differential Revision: https://reviews.llvm.org/D107192	2021-08-03 17:56:35 -07:00
wlei	fe3ba90830	[llvm-profgen] Support perf script without parsing MMap events This change supports running without parsing MMap binary loading events; instead, it always assumes the binary is loaded at the preferred address. This is used when we are assured that the binary load address does not change or we have pre-processed the address resolution. Warn if there's an interior mmap event but no leading mmap events. Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D107097	2021-08-03 10:01:07 -07:00
Andrea Di Biagio	f0658c7a42	[MCA][NFC] Add tests for PR51318 and PR51322. Also, regenerate existing X86 tests using update_mca_test.py.	2021-08-03 17:06:34 +01:00
Jason Molenda	0d8cd4e2d5	[AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted Add a comment when there is a shifted value, add x9, x0, #291, lsl #12 ; =1191936 but not when the immediate value is unshifted, subs x9, x0, #256 ; =256 when the comment adds nothing additional to the reader. Differential Revision: https://reviews.llvm.org/D107196	2021-08-03 02:28:46 -07:00
Esme-Yi	69396896fb	[llvm-readobj][XCOFF] Fix the error dumping for the first item of StringTable. Summary: For the string table in XCOFF, the first 4 bytes contain the length of the string table, so we should print the string entries starting from the fifth byte. This patch also adds tests for llvm-readobj dumping the string table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D105522	2021-08-03 09:08:58 +00:00
Eli Friedman	2a2847823f	[ConstantFold] Get rid of special cases for sizeof etc. Target-dependent constant folding will fold these down to simple constants (or at least, expressions that don't involve a GEP). We don't need heroics to try to optimize the form of the expression before that happens. Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 . Differential Revision: https://reviews.llvm.org/D107116	2021-07-31 13:20:47 -07:00
Petr Hosek	83302c8489	[profile] Fix profile merging with binary IDs This fixes support for merging profiles which broke as a consequence of `e50a38840d`. The issue was missing adjustment in merge logic to account for the binary IDs which are now included in the raw profile just after header. In addition, this change also: * Includes the version in module signature that's used for merging to avoid accidental attempts to merge incompatible profiles. * Moves the binary IDs size field after version field in the header as was suggested in the review. Differential Revision: https://reviews.llvm.org/D107143	2021-07-30 18:54:27 -07:00
Petr Hosek	d3dd07e3d0	Revert "[profile] Fix profile merging with binary IDs" This reverts commit `dcadd64986`.	2021-07-30 18:53:48 -07:00
Petr Hosek	dcadd64986	[profile] Fix profile merging with binary IDs This fixes support for merging profiles which broke as a consequence of `e50a38840d`. The issue was missing adjustment in merge logic to account for the binary IDs which are now included in the raw profile just after header. In addition, this change also: * Includes the version in module signature that's used for merging to avoid accidental attempts to merge incompatible profiles. * Moves the binary IDs size field after version field in the header as was suggested in the review. Differential Revision: https://reviews.llvm.org/D107143	2021-07-30 17:38:53 -07:00
Fangrui Song	a1532ed275	[InstrProfiling] Make CountersPtr in __profd_ relative Change `CountersPtr` in `__profd_` to a label difference, which is a link-time constant. On ELF, when linking a shared object, this requires that `__profc_` is either private or linkonce/linkonce_odr hidden. On COFF, we need D104564 so that `.quad a-b` (64-bit label difference) can lower to a 32-bit PC-relative relocation. ``` # ELF: R_X86_64_PC64 (PC-relative) .quad .L__profc_foo-.L__profd_foo # Mach-O: a pair of 8-byte X86_64_RELOC_UNSIGNED and X86_64_RELOC_SUBTRACTOR .quad l___profc_foo-l___profd_foo # COFF: we actually use IMAGE_REL_AMD64_REL32/IMAGE_REL_ARM64_REL32 so # the high 32-bit value is zero even if .L__profc_foo < .L__profd_foo # As compensation, we truncate CountersDelta in the header so that # __llvm_profile_merge_from_buffer and llvm-profdata reader keep working. .quad .L__profc_foo-.L__profd_foo ``` (Note: link.exe sorts `.lprfc` before `.lprfd` even if the object writer has `.lprfd` before `.lprfc`, so we cannot work around by reordering `.lprfc` and `.lprfd`.) With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`) `ld -pie` linked clang is 1.74% smaller due to fewer R_X86_64_RELATIVE relocations. ``` % readelf -r pie | awk '$3~/R.*/{s[$3]++} END {for (k in s) print k, s[k]}' R_X86_64_JUMP_SLO 331 R_X86_64_TPOFF64 2 R_X86_64_RELATIVE 476059 # was: 607712 R_X86_64_64 2616 R_X86_64_GLOB_DAT 31 ``` The absolute function address (used by llvm-profdata to collect indirect call targets) can be converted to relative as well, but is not done in this patch. Differential Revision: https://reviews.llvm.org/D104556	2021-07-30 11:52:18 -07:00
Esme-Yi	8011fc1953	[yaml2obj] Enable support for parsing 64-bit XCOFF. Summary: Add support for yaml2obj to parse 64-bit XCOFF. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D100375	2021-07-30 02:06:04 +00:00
Andrew Savonichev	bcc83a2e83	[MCA] Use LSU for the in-order pipeline Load/Store unit is used to enforce order of loads and stores if they alias (controlled by --noalias=false option). Fixes PR50483 - [MCA] In-order pipeline doesn't track memory load/store dependencies. Differential Revision: https://reviews.llvm.org/D103955	2021-07-29 14:40:23 +03:00
Sebastian Neubauer	4864893127	[Utils] Do not remove comments in llc test script When checking if two prefixes can be merged for a function, update_llc_test_checks.py removed IR comments before comparing llc outputs of different RUN lines. This means, if one RUN line emitted lines starting with ';' and another RUN line emitted the same lines except the ones starting with ';', both RUNs would be merged (if they share a prefix). However, CHECK-NEXT lines check the comments, otherwise they fail, so the script should not merge RUNs if they contain different comments. Differential Revision: https://reviews.llvm.org/D101312	2021-07-29 13:03:05 +02:00
Nathan Chancellor	5060224d9e	[test] Fix tools/gold/X86/comdat-nodeduplicate.ll on non-X86 hosts When running this test on an aarch64 machine, it fails: ``` /usr/bin/ld.gold: error: .../test/tools/gold/X86/Output/comdat-nodeduplicate.ll.tmp/ab.lto.o: incompatible target ``` Specify the elf_x86_64 emulation as all of the other gold plugin tests do. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D107020	2021-07-28 21:56:23 -07:00
Daniel Rodríguez Troitiño	d6704e5ed9	[llvm-objcopy][MachO] Ignore all LC_SUB_* commands. The LC_SUB_FRAMEWORK, LC_SUB_UMBRELLA, LC_SUB_CLIENT, and LC_SUB_LIBRARY are used to indicate related libraries, binaries or framework names. Their only payload is the string with the name of the object. Adding those commands to the list of ignored/skipped load commands will avoid an error that stops the process of copying/stripping and will copy their contents verbatim. Additionally, in order to have a test for this case, `yaml2obj` now allows those four commands to contain a `Content`. Differential Revision: https://reviews.llvm.org/D106412	2021-07-28 17:35:26 -07:00
Eli Friedman	4adcff0b70	[ARM] Fix llvm-objdump disassembly of armv7m object files. Apparently, the features were getting mixed up, so we'd try to disassemble in ARM mode. Fix sub-architecture detection to compute the correct triple if we're detecting it automatically, so the user doesn't need to pass --triple=thumb etc. It's possible we should be somehow tying the "+thumb-mode" target feature more directly to Tag_CPU_arch_profile? But this seems to work reasonably well, anyway. While I'm here, fix up the other llvm-objdump tests that were explicitly specifying an ARM triple; that shouldn't be necessary. Differential Revision: https://reviews.llvm.org/D106912	2021-07-28 11:41:54 -07:00
Wael Yehia	9559bd1990	[LTO][Legacy] Add new API to check presence of ctor/dtor functions. On AIX, the linker needs to check whether a given lto_module_t contains any constructor/destructor functions, in order to implement the behavior of the -bcdtors:all flag. See https://www.ibm.com/docs/en/aix/7.2?topic=l-ld-command for the flag's documentation. In llvm IR, constructor (destructor) functions are added to a special global array @llvm.global_ctors (@llvm.global_dtors). However, because these two symbols are artificial, they are not visited during the symbol traversal (using the lto_module_get_[num_symbols|symbol_name|symbol_attribute] API). This patch adds a new function to the libLTO interface that checks the presence of one or both of these two symbols. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D106887	2021-07-28 12:41:56 +00:00
Esme-Yi	14f6cfcf3c	[Debug-Info][llvm-dwarfdump] Don't try to dump location list for attributes that don't have the loclist class. Summary: The overflow error occurs when we try to dump location list for those attributes that do not have the loclist class, like DW_AT_count and DW_AT_byte_size. After re-reviewing the entire list, I sorted those attributes into two parts, one for dumping location list and one for dumping the location expression. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D105613	2021-07-27 07:28:59 +00:00
Fangrui Song	792c206e2b	[llvm-objcopy] Drop GRP_COMDAT if the group signature is localized See [GRP_COMDAT group with STB_LOCAL signature](https://groups.google.com/g/generic-abi/c/2X6mR-s2zoc) objcopy PR: https://sourceware.org/bugzilla/show_bug.cgi?id=27931 GRP_COMDAT deduplication is purely based on the signature symbol name in ld.lld/GNU ld/gold. The local/global status is not part of the equation. If the signature symbol is localized by --localize-hidden or --keep-global-symbol, the intention is likely to make the group fully localized. Drop GRP_COMDAT to suppress deduplication. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D106782	2021-07-26 09:05:18 -07:00
Fangrui Song	c0da287c30	[yaml2obj][MachO] Rename PayloadString to Content The new name is conciser and matches yaml2obj ELF & DWARF. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D106759	2021-07-26 09:04:51 -07:00
gbreynoo	87ed73fe6e	[llvm-readobj] Display multiple function names for stack size entries The current implementation of displaying .stack_size information presumes that each entry represents a single function but this is not always the case. For example with the use of ICF multiple functions can be represented with the same code, meaning that the address found in a .stack_size entry corresponds to multiple function symbols. This change allows multiple function names to be displayed when appropriate. Differential Revision: https://reviews.llvm.org/D105884	2021-07-26 14:49:53 +01:00
Martin Storsjö	0a1683f8cc	[llvm-rc] Allow dashes as part of resource name strings This matches what MS rc.exe allows in practice. I'm not aware of any legal syntax case that are broken by allowing dashes as part of what the tokenizer considers an Identifier - but I'm not very well versed in the RC syntax either, can @amccarth think of any case that would be broken by this? This fixes downstream bug https://github.com/msys2/MINGW-packages/issues/9180. Additionally, rc.exe allows such resource name strings to be surrounded by quotes, ending up with e.g. Resource name (string): "QUOTEDNAME" (i.e., the quotes end up as part of the string), which llvm-rc doesn't support yet either. (I'm not aware of such cases in the wild though, but resource string names with dashes do exist.) This also allows including files with unquoted paths, with filenames containing dashes (which fixes https://github.com/msys2/MINGW-packages/issues/9130, which has been worked around differently so far). Differential Revision: https://reviews.llvm.org/D106598	2021-07-23 23:05:20 +03:00
Fangrui Song	31677c6481	[llvm-symbolizer] Remove one-dash long options Most modern tools only accept two-dash long options. Remove one-dash long options which are not recognized by GNU style `getopt_long`. This ensures long options cannot collide with grouped short options. Note: llvm-symbolizer has `-demangle={true,false}` for pprof compatibility (for a while). They are kept. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D106377	2021-07-23 08:35:45 -07:00
Gulfem Savrun Yeniceri	e50a38840d	[profile] Add binary id into profiles This patch adds binary id into profiles to easily associate binaries with the corresponding profiles. There is an RFC that discusses the motivation, design and implementation in more detail: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html Differential Revision: https://reviews.llvm.org/D102039	2021-07-23 00:19:12 +00:00
Eric Astor	a4e964a282	[ms] [llvm-ml] Fix macro case-insensitivity We previously had issues identifying macros not registered with a lowercase name. Reviewed By: mstorsjo, thakis Differential Revision: https://reviews.llvm.org/D106453	2021-07-22 15:50:52 -04:00
Simon Pilgrim	d073b19dbf	[X86] Fix SLM FP<->INT throughputs. Noticed while trying to clean up the shift costs model for SSE4 targets using the script in D10369 - SLM double-pumps all the 128-bit vector conversion ops and only use FP0 pipe - numbers taken from Intel AOM + Agner.	2021-07-22 19:39:04 +01:00
Timm Bäder	924d62ca4a	[llvm][tools] Hide remaining unrelated llvm- tool options Differential Revision: https://reviews.llvm.org/D106430	2021-07-22 09:47:55 +02:00
Bill Wendling	635288d215	[llvm-diff] Check for recursive initializers We need to check for recursive initializers in the "ConstantStruct" case. Differential Revision: https://reviews.llvm.org/D105616	2021-07-21 14:21:21 -07:00
Gulfem Savrun Yeniceri	fd895bc81b	Revert "[profile] Add binary id into profiles" Revert "[profile] Change linkage type of a compiler-rt func" This reverts commits `f984ac2715` and `467c719124` because it broke some builds.	2021-07-21 19:15:18 +00:00
Gulfem Savrun Yeniceri	f984ac2715	[profile] Add binary id into profiles This patch adds binary id into profiles to easily associate binaries with the corresponding profiles. There is an RFC that discusses the motivation, design and implementation in more detail: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html Differential Revision: https://reviews.llvm.org/D102039	2021-07-21 17:55:43 +00:00
Eric Astor	69551486fd	[ms] [llvm-ml] Restrict implicit RIP-relative addressing to named-variable references ML64.EXE applies implicit RIP-relative addressing only to memory references that include a named-variable reference. Reviewed By: mstorsjo Differential Revision: https://reviews.llvm.org/D105372	2021-07-21 11:49:58 -04:00
Eric Astor	5fba605896	[ms] [llvm-ml] Support built-in text macros Add support for all built-in text macros supported by ML64: @Date, @Time, @FileName, @FileCur, and @CurSeg. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104965	2021-07-21 11:44:09 -04:00
Eric Astor	4cbb912d75	[ms] [llvm-ml] Add support for numeric built-in symbols Support @Version and @Line as built-in symbols. For now, resolves @Version to 1427 (the same as for the VS 2019 release of ML.EXE). Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104964	2021-07-21 11:43:07 -04:00

5669 Commits