llvm-project

Commit Graph

Author	SHA1	Message	Date
Noah Shutty	e2ad4f1756	[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer. Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`. Fixed a cast of Error::success() to Expected<> in debuginfod library. Added Debuginfod to Symbolize deps in gn. Adds new symbolizer symbols to `global_symbols.txt`. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D113717	2021-12-10 00:23:00 +00:00
Haowei Wu	5e171cebd3	[ifs] Add options to allow llvm-ifs to generate multiple outputs This change adds options to llvm-ifs to allow it to generate multiple types of stub files at a single invocation. Differential Revision: https://reviews.llvm.org/D115024	2021-12-09 12:25:51 -08:00
wlei	7d62b68abc	[llvm-profgen] remove check Attributes to fix build failure	2021-12-08 14:13:14 -08:00
wlei	057b0430af	[llvm-profgen] fix build failure in cs-extbinary.test	2021-12-08 13:18:26 -08:00
wlei	5d66113afc	[llvm-profgen] fix to use profile-summary-hot-count instead of profile-summary-cold-count for CS profile	2021-12-08 12:54:07 -08:00
wlei	484a569eea	[llvm-profgen] Fix total samples related issues Since total samples and body samples are used to compute the hotness threshold in the compiler, we found that in some services changing the total samples computation causes a noticeable regression. Hence, here we revert the changes and keep all total sample numbers identical to the old tool. Three changes in this diff: 1. Revert previous diff (https://reviews.llvm.org/D112672: [llvm-profgen] Update total samples by accumulating all its body samples) and put it under a switch. 2. Keep the negative line number. Although the compiler doesn't consume the count, it is used to compute the hot threshold. 3. Change to accumulate total samples per byte instead of per instruction. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D115013	2021-12-08 12:33:41 -08:00
wlei	27cb3707db	[llvm-profgen] Trim cold function profiles for non-CS AutoFDO This change allows trimming the profile if it's considered to be cold for baseline AutoFDO. We reuse the cold threshold from `ProfileSummaryBuilder::getColdCountThreshold(..)`, which can be set by percent (--profile-summary-cutoff-cold) or by value (--profile-summary-cold-count). Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D113785	2021-12-08 12:20:50 -08:00
Noah Shutty	aaec63d2a7	Revert "[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer." This reverts commit `02cc8d698c` because it caused buildbot failures. The issue appears to be simply that we need to only enable debuginfod when the HTTPClient has been initialized by the running tool, since InitLLVM does not do the initialization step anymore.	2021-12-08 18:49:12 +00:00
Noah Shutty	02cc8d698c	[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer. Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D113717	2021-12-08 17:52:40 +00:00
Andrew Savonichev	420300c0d8	[MCA] Remove the warning about experimental support for in-order CPU There are not a lot of bug reports for this feature, so let's mark it stable. Differential Revision: https://reviews.llvm.org/D114701	2021-12-07 15:27:51 +03:00
Ties Stuij	63eb7ff47d	[ARM] Implement PAC return address signing mechanism for PACBTI-M This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things: - PAC and AUT instructions are generated. - They're part of the stack frame setup, so that shrink-wrapping can move them inwards to cover only part of a function - The auth code generated by PAC is saved across subroutine calls so that AUT can find it again to check - PAC is emitted before stacking registers (so that the SP it signs is the one on function entry). - The new pseudo-register ra_auth_code is mentioned in the DWARF frame data - With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT validates the corresponding value of SP - Emit correct unwind information when PAC is replaced by PACBTI - Handle tail calls correctly Some notes: We make the assembler accept the `.save {ra_auth_code}` directive that is emitted by the compiler when it saves a register that contains a return address authentication code. For EHABI we need to have the `FrameSetup` flag on the instruction and handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit `.save {ra_auth_code}`, instead of `.save {r12}`. For PACBTI-M, the instruction which computes return address PAC should use SP value before adjustment for the argument registers save area (used for variadic functions and when a parameter is split between stack and register), but at the same time it should be after the instruction that saves FPCXT when compiling a CMSE entry function. This patch moves the varargs SP adjustment after the FPCXT save (they are never enabled at the same time), so in a following patch handling of the `PAC` instruction can be placed between them. Epilogue emission code adjusted in a similar manner. PACBTI-M code generation should not emit any instructions for architectures v6-m, v8-m.base, and for A- and R-class cores. 
Diagnostic message for such cases is handled separately by a future ticket. Note on tail calls: If the called function has four arguments that occupy registers `r0`-`r3`, the only option for holding the function pointer itself is `r12`, but this register is used to keep the PAC during function prologue/epilogue and clobbers the function pointer. When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to keep six values - the four function arguments, the function pointer and the PAC, which is obviously impossible. One option would be to authenticate the return address before all callee-saved registers are restored, so we have a scratch register to temporarily keep the value of `r12`. The issue with this approach is that it violates a fundamental invariant that PAC is computed using CFA as a modifier. It would also mean using separate instructions to pop `lr` and the rest of the callee-saved registers, which would offset the advantages of doing a tail call. Instead, this patch disables indirect tail calls when the called function takes four or more arguments and the return address sign and authentication is enabled for the caller function, conservatively assuming the caller function would spill LR. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Ties Stuij Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D112429	2021-12-07 10:15:19 +00:00
Kristina Bessonova	0bf2c87785	[llvm-dwarfdump] Do not print preceding :: for local types Reviewed By: dblaikie, jhenderson Differential Revision: https://reviews.llvm.org/D114892	2021-12-03 12:27:29 +02:00
Paul Robinson	7bef49296e	[TLI checker] Follow good practice with -COUNT directives FileCheck's -COUNT suffix doesn't fail if there are more matches than you asked for, so it's good practice to put a -NOT after.	2021-12-02 14:28:26 -08:00
Paul Robinson	d3fe1c1583	Reapply "[TLI checker] Add more tests" This reverts commit `8cd61aac00`. I had missed one place in a test that needed updating; it passed on my dirty build tree but not on a clean one. Original commit message: D114478 identified testing gaps; this patch fills them. Differential Revision: https://reviews.llvm.org/D114913	2021-12-02 08:56:21 -08:00
Paul Robinson	8cd61aac00	Revert "[TLI checker] Add more tests" This reverts commit `2778554971`. Some bots are failing on the updated tests.	2021-12-02 08:31:27 -08:00
Paul Robinson	2778554971	[TLI checker] Add more tests D114478 identified testing gaps; this patch fills them. Differential Revision: https://reviews.llvm.org/D114913	2021-12-02 08:17:16 -08:00
Frederic Cambus	878ff1f9f8	[llvm-readobj] Add support for machine-independent NetBSD ELF core notes. Notes generated in NetBSD core files provide additional information about processes. These notes are described in core.5, which can be viewed here: https://man.netbsd.org/core.5 Differential Revision: https://reviews.llvm.org/D114635	2021-12-02 12:10:17 +01:00
Florian Hahn	ad88a37cea	[TLI] Add memset_pattern4, memset_pattern8 lib functions. Similar to memset_pattern16, memset_pattern4, memset_pattern8 are available on Darwin platforms. https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/memset_pattern4.3.html Reviewed By: ab Differential Revision: https://reviews.llvm.org/D114881	2021-12-01 21:18:19 +00:00
Paul Robinson	66071f440c	[TLI checker] Update for post-commit review comments Ignore undefined symbols; other minor code cleanup. Replace test objects and their asm source with a yaml equivalent. Differential Revision: https://reviews.llvm.org/D114478	2021-12-01 12:33:54 -08:00
Snehasish Kumar	3a4d373ec2	[memprof] Align each rawprofile section to 8b. The first 8b of each raw profile section need to be aligned to 8b since the first item in each section is a u64 count of the number of items in the section. Summary of changes: * Assert alignment when reading counts. * Update test to check alignment, relax some size checks to allow padding. * Update raw binary inputs for llvm-profdata tests. Differential Revision: https://reviews.llvm.org/D114826	2021-11-30 20:12:43 -08:00
wlei	41a681ce09	[FS-AFDO][llvm-profgen] Generate profile with FS-AFDO discriminator In order to support generating profile with FS discriminator, three kinds of changes are made in llvm-profgen: 1) Disassemble .rodata section to check if FS discriminator var ('"__llvm_fs_discriminator__"') exists and set the corresponding flag in the binary. 2) Change the discriminator decoding in `getBaseDiscriminator` and `getDuplicationFactor`. 3) Set `FunctionSamples::ProfileIsFS` to true to enable FS functionality in ProfileData. Reviewed By: xur, hoy, wenlei Differential Revision: https://reviews.llvm.org/D113296	2021-11-30 15:57:59 -08:00
Snehasish Kumar	86d5dc9afc	[memprof] Disallow memprof profile reader tests on non-x86 archs. The memprof profile reader tests rely on binary data which is generated from and meant to be interpreted on little endian architectures. Add a REQUIRES: x86_64-linux clause to both tests to ensure they don't fail on big endian targets such as ppc.	2021-11-30 12:27:06 -08:00
Snehasish Kumar	7cca33b40f	[memprof] Extend llvm-profdata to display MemProf profile summaries. This commit adds initial support to llvm-profdata to read and print summaries of raw memprof profiles. Summary of changes: * Refactor shared defs to MemProfData.inc * Extend show_main to display memprof profile summaries. * Add a simple raw memprof profile reader. * Add a couple of tests to tools/llvm-profdata. Differential Revision: https://reviews.llvm.org/D114286	2021-11-30 10:45:26 -08:00
wlei	c2e08aba1a	[llvm-profgen] Compute and show profile density AutoFDO performance is sensitive to profile density, i.e., the amount of samples in the profile relative to the program size, because profiles with insufficient samples could be inaccurate due to statistical noise and thus hurt AutoFDO performance. A previous investigation showed that AutoFDO performed better on MySQL with increased amount of samples. Therefore, we implement a profile-density computation feature to give hints about profile density to users and the compiler. We define the density of a profile Prof as follows: - For each function A in the profile, density(A) = total_samples(A) / sizeof(A). - density(Prof) = min(density(A)) for all functions A that are warm (defined below). A function is considered warm if its total-samples is within top N percent of the profile. For implementation, we reuse the `ProfileSummaryBuilder::getHotCountThreshold(..)` as threshold which can be set by percent(`--profile-summary-cutoff-hot`) or by value(`--profile-summary-hot-count`). We also introduce `--hot-function-density-threshold` to set hot function density threshold and will give suggestion if profile density is below it which implies we should increase samples. This also applies for CS profile with all profiles merged into base. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D113781	2021-11-29 23:54:31 -08:00
Jeremy Morse	fc9dae420c	[DebugInfo][InstrRef][NFC] "Final" x86 test cleanup These are some final test changes for using instruction referencing on X86: * Most of these tests just have the flag switched so that they run with instr-ref, and just work: these tests were fixed by earlier patches. * There are some spurious differences in textual outputs, * A few have different temporary labels in the output because more MCSymbols are printed to the output. Differential Revision: https://reviews.llvm.org/D114588	2021-11-29 22:56:09 +00:00
Zhuo Zhang	d96f92ff16	fix typos in comments	2021-11-29 14:06:33 +01:00
Bjorn Pettersson	407600604b	[test] Use -passes in lit tests for the UpdateTestChecks tool The UpdateTestChecks tool itself does not care about which pass manager is used in the opt invocation. So the lit tests that verify the behavior of the UpdateTestChecks tool are updated to use the new-PM syntax (-passes=) when specifying the pass pipeline in the test cases. Differential Revision: https://reviews.llvm.org/D114517	2021-11-27 09:52:55 +01:00
Zarko Todorovski	e714394ab8	[LLVM][llvm-cov] Inclusive language: rename option -name-whitelist to -name-allowlist Renamed the option for llvm-cov and changed variable names to use more inclusive terms. Also changed the binary for the test. Reviewed By: alanphipps Differential Revision: https://reviews.llvm.org/D112816	2021-11-26 11:08:01 -05:00
Zarko Todorovski	7f7dac7126	[NFC][llvm] Inclusive language: reword uses of sanity test and check Part of continuing work to use more inclusive language. Reworded uses of sanity check and sanity test in llvm/test/	2021-11-25 07:21:42 -05:00
David Blaikie	cd93ab8947	DWARFVerifier: Don't parse all units twice Introduced/discussed in https://reviews.llvm.org/D38719 The header validation logic was also explicitly building the DWARFUnits to validate. But then other calls, like "Units.getUnitForOffset" creates the DWARFUnits again in the DWARFContext proper - so, let's avoid creating the DWARFUnits twice by walking the DWARFContext's units rather than building a new list explicitly. This does reduce some verifier power - it means that any unit with a header parsing failure won't get further validation, whereas the verifier-created units were getting some further validation despite invalid headers. I don't think this is a great loss/seems "right" in some ways to me that if the header's invalid we should stop there. Exposing the raw DWARFUnitVectors from DWARFContext feels a bit sub-optimal, but gave simple access to the getUnitForOffset to keep the rest of the code fairly similar.	2021-11-24 14:03:56 -08:00
Djordje Todorovic	e3d8ebe158	[llvm-dwarfdump][Statistics] Handle LTO cases with cross CU referencing With link-time optimizations enabled, resulting DWARF may end up containing cross CU references (through the DW_AT_abstract_origin attribute). Consider the following example: // sum.c __attribute__((always_inline)) int sum(int a, int b) { return a + b; } // main.c extern int sum(int, int); int main() { int a = 5, b = 10, c = sum(a, b); return 0; } Compiled as follows: $ clang -g -flto -fuse-ld=lld main.c sum.c -o main Results in the following DWARF: -- sum.c CU: abstract instance tree ... 0x000000b0: DW_TAG_subprogram DW_AT_name ("sum") DW_AT_decl_file ("sum.c") DW_AT_decl_line (1) DW_AT_prototyped (true) DW_AT_type (0x000000d3 "int") DW_AT_external (true) DW_AT_inline (DW_INL_inlined) 0x000000bc: DW_TAG_formal_parameter DW_AT_name ("a") DW_AT_decl_file ("sum.c") DW_AT_decl_line (1) DW_AT_type (0x000000d3 "int") 0x000000c7: DW_TAG_formal_parameter DW_AT_name ("b") DW_AT_decl_file ("sum.c") DW_AT_decl_line (1) DW_AT_type (0x000000d3 "int") ... -- main.c CU: concrete inlined instance tree ... 0x0000006d: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x00000000000000b0 "sum") DW_AT_low_pc (0x00000000002016ef) DW_AT_high_pc (0x00000000002016f1) DW_AT_call_file ("main.c") DW_AT_call_line (5) DW_AT_call_column (0x19) 0x00000081: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg0 RAX) DW_AT_abstract_origin (0x00000000000000bc "a") 0x00000088: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg2 RCX) DW_AT_abstract_origin (0x00000000000000c7 "b") ... Note that each entry within the concrete inlined instance tree in the main.c CU has a DW_AT_abstract_origin attribute which refers to a corresponding entry within the abstract instance tree in the sum.c CU. 
llvm-dwarfdump --statistics did not properly report DW_TAG_formal_parameters/DW_TAG_variables from concrete inlined instance trees which had 0% location coverage and which referred to a different CU, mainly because information about abstract instance trees and their parameters/variables was stored locally - just for the currently processed CU, rather than globally - for all CUs. In particular, if the concrete inlined instance tree from the example above was to look like this (i.e. parameter b has 0% location coverage, hence why it's missing): 0x0000006d: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x00000000000000b0 "sum") DW_AT_low_pc (0x00000000002016ef) DW_AT_high_pc (0x00000000002016f1) DW_AT_call_file ("main.c") DW_AT_call_line (5) DW_AT_call_column (0x19) 0x00000081: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg0 RAX) DW_AT_abstract_origin (0x00000000000000bc "a") llvm-dwarfdump --statistics would have not reported b as such. Patch by Dimitrije Milosevic. Differential revision: https://reviews.llvm.org/D113465	2021-11-24 13:50:47 +01:00
Florian Hahn	8ef460fc51	[llvm-reduce] Add parallel chunk processing. This patch adds parallel processing of chunks. When reducing very large inputs, e.g. functions with 500k basic blocks, processing chunks in parallel can significantly speed up the reduction. To allow modifying clones of the original module in parallel, each clone needs their own LLVMContext object. To achieve this, each job parses the input module with their own LLVMContext. In case a job successfully reduced the input, it serializes the result module as bitcode into a result array. To ensure parallel reduction produces the same results as serial reduction, only the first successfully reduced result is used, and results of other successful jobs are dropped. Processing resumes after the chunk that was successfully reduced. The number of threads to use can be configured using the -j option. It defaults to 1, which means serial processing. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D113857	2021-11-24 09:23:52 +00:00
Bill Wendling	2975f37d8d	[llvm-diff] Implement diff of PHI nodes Implement diff of PHI nodes Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D114211	2021-11-22 13:23:10 -08:00
wangpc	af0ecfccae	[RISCV] Generate pseudo instruction li Add an alias of `addi [x], zero, imm` to generate pseudo instruction li, which makes assembly much more readable. For existing tests, users can update them by running script `llvm/utils/update_llc_test_checks.py`. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D112692	2021-11-22 14:01:37 +08:00
Roman Lebedev	2f364f6f0d	[NFC][X86][MCA] Add forgotten test coverage for AVX512's VPMOVM2[BWDQ] / VPMOV[BWDQ]2M	2021-11-20 13:09:18 +03:00
David Blaikie	3cbc4c487a	llvm-dwarfdump: Rebuild type names in dwo type units	2021-11-18 14:12:48 -08:00
Keith Smiley	68311f21eb	[llvm-objcopy][MachO] Add llvm-strip support for newer load commands Previously llvm-strip would fail because of unknown commands. Fixes https://bugs.llvm.org/show_bug.cgi?id=50044 Differential Revision: https://reviews.llvm.org/D113734	2021-11-17 10:36:35 -08:00
Keith Smiley	693b02023e	[llvm-objdump/mac] Add support for new load commands Differential Revision: https://reviews.llvm.org/D113733	2021-11-17 09:53:25 -08:00
Jeremy Morse	1dc0e47cb9	[DebugInfo][NFC] Force some tests to not use instruction-referencing There are various tests that need to be adjusted to test the right thing with instruction referencing -- usually because the internal representation of variables is different, sometimes that location lists change. This patch makes a bunch of tests explicitly not use instruction referencing, so that a check-llvm test with instruction referencing on for x86_64 doesn't fail. I'll then convert the tests to have instr-ref CHECK lines, and similar. Differential Revision: https://reviews.llvm.org/D113194	2021-11-17 11:51:29 +00:00
Leonard Chan	b75cc51df7	Limit test to x86 for now.	2021-11-16 14:46:02 -08:00
Leonard Chan	25bcd94234	[llvm-objcopy] Add --update-section This is another attempt at D59351 which attempted to add --update-section, but with some heuristics for adjusting segment/section offsets/sizes in the event the data copied into the section is larger than the original size of the section. We are opting to not support this case. GNU's objcopy was able to do this because the linker and objcopy are tightly coupled enough that segment reformatting was simpler. This is not the case with llvm-objcopy and lld where they like to be separated. This will attempt to copy data into the section without changing any other properties of the parent segment (if the section is part of one). Differential Revision: https://reviews.llvm.org/D112116	2021-11-16 14:10:40 -08:00
Florian Hahn	28d95a2610	[llvm-reduce] Allow writing temporary files as bitcode. Textual LLVM IR files are much bigger and take longer to write to disk. To avoid the extra cost incurred by serializing to text, this patch adds an option to save temporary files as bitcode instead. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D113858	2021-11-16 12:39:42 +00:00
Wenlei He	f7976edc1e	[llvm-profgen] Add switch to allow use of first loadable segment for calculating offset Adding `-use-loadable-segment-as-base` to allow use of first loadable segment for calculating offset. By default first executable segment is used for calculating offset. The switch helps compatibility with unsymbolized profile generated from older tools. Differential Revision: https://reviews.llvm.org/D113727	2021-11-15 19:00:27 -08:00
James Henderson	254aa65d04	[llvm-nm][test] Move X86 lit.local.cfg into the X86 subfolder The file seems to have been put in the wrong place in its original commit. This had the effect of marking all llvm-nm tests as unsupported, unless X86 was enabled, even for tests that weren't X86 specific. Fixes https://bugs.llvm.org/show_bug.cgi?id=52506. Reviewed by: mstorsjo Differential Revision: https://reviews.llvm.org/D113882	2021-11-15 13:04:42 +00:00
Keith Smiley	47bb456b2f	[llvm-objcopy][MachO] Add error for MH_PRELOAD Previously this would crash. Fixes https://bugs.llvm.org/show_bug.cgi?id=51877 Differential Revision: https://reviews.llvm.org/D113819	2021-11-12 19:18:34 -08:00
Tomasz Miąsko	c3e07df607	[llvm-nm] Demangle Rust symbols Add support for demangling Rust v0 symbols to llvm-nm by reusing nonMicrosoftDemangle which supports both Itanium and Rust mangling. Reviewed By: dblaikie, jhenderson Differential Revision: https://reviews.llvm.org/D111937	2021-11-12 12:46:59 +01:00
Arthur Eubanks	be0b47d530	[llvm-reduce] Skip replacing metadata and callee operands Metadata operands tend to require special conditions, especially on dbg intrinsics. We also don't have a zero value for metadata. Replacing callee operands is a little weird, since calling undef/null doesn't make sense. It also causes tons of invalid reductions when reducing calls to intrinsics since only arguments to intrinsics can be of the metadata type. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D113532	2021-11-11 18:42:16 -08:00
Michael Kruse	c15f930e96	[llvm-reduce] Introduce operands-skip pass. Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance: ``` %baseptr = alloca i32 %arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom store i32 42, i32* %arrayidx ``` might be reducible to ``` %baseptr = alloca i32 %arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom store i32 42, i32* %baseptr ``` Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check. In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set, which helps reduce the number of relevant arguments, mitigating a concern of too many arguments mentioned in https://reviews.llvm.org/D110274#3025013. Possible future extensions: * Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility. * If undef is added to the candidate set, "operands-skip" is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set. Recommit after resolving conflict with D112651 and reusing shouldReduceOperand from D113532. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D111818	2021-11-11 20:16:34 -06:00
Michael Kruse	ed7b37155b	Revert "[llvm-reduce] Introduce operands-skip pass." This reverts commit `fa4210a9a0`. It causes compile failures, presumably because conflicting with another patch landed after I checked locally.	2021-11-11 19:25:39 -06:00
Michael Kruse	fa4210a9a0	[llvm-reduce] Introduce operands-skip pass. Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance: ``` %baseptr = alloca i32 %arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom store i32 42, i32* %arrayidx ``` might be reducible to ``` %baseptr = alloca i32 %arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom store i32 42, i32* %baseptr ``` Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check. In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set, which helps reduce the number of relevant arguments, mitigating a concern of too many arguments mentioned in https://reviews.llvm.org/D110274#3025013. Possible future extensions: * Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility. * If undef is added to the candidate set, "operands-skip" is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D111818	2021-11-11 18:54:01 -06:00
Nico Weber	e23c6cc54e	[aarch64/mac] Correctly disassemble @TLVPPAGE(OFF) relocs `llvm-otool -tV foo.o` and `llvm-objdump --macho -d foo.o` would previously fail on object files containing @TLVPPAGE or @TLVPPAGEOFF relocs. Move llvm-objdump-specific test from llvm/test/MC/AArch64/arm64-tls-modifiers-darwin.s to new llvm/test/tools/llvm-objdump/MachO/disassemble-arm64-tlv-modifers.test and put test for this fix to that new file. Fixes PR52356. Differential Revision: https://reviews.llvm.org/D112843	2021-11-10 10:41:18 -05:00
Esme-Yi	ab97ffb96a	Reland [XCOFF][yaml2obj] support for the auxiliary file header. Summary: Fix the build failure on MSVC by making the `T` and `U` of the function 'T llvm::Optional<T>::getValueOr<llvm::yaml::Hex32>(U &&) const &' the same. Differential Revision: https://reviews.llvm.org/D111487	2021-11-10 07:23:56 +00:00
David Blaikie	58b1b6414b	llvm-dwarfdump: Lookup type units when prettyprinting types This handles DWARFv4 and DWARFv5 type units, but not Split DWARF type units. That'll come in a follow-up patch.	2021-11-09 16:58:22 -08:00
Gulfem Savrun Yeniceri	126e7611c7	[compiler-rt] Fix diagnostic in InstrProfError This patch fixes some issues introduced in https://reviews.llvm.org/D108942: 1) Remove the default label to fix the bots that use -Werror,-Wcovered-switch-default 2) Modify the malformed test to fix the bots that are built without zlib support 3) Modify some error messages in malformed profiles	2021-11-09 20:30:03 +00:00
Dwight Guth	16c3db8def	[llvm-reduce] Fix invalid reduction in basic-blocks delta pass Previously, if the basic-blocks delta pass tried to remove a basic block that was the last basic block in a function that did not have external or weak linkage, the resulting IR would become invalid. Since removing the last basic block in a function is effectively identical to removing the function body itself, we check explicitly for this case and if we detect it, we run the same logic as in ReduceFunctionBodies.cpp Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D113486	2021-11-09 10:43:38 -08:00
Dwight Guth	fbfd327fdf	[llvm-reduce] Add flag to start at finer granularity Sometimes if llvm-reduce is interrupted in the middle of a delta pass on a large file, it can take quite some time for the tool to start actually doing new work if it is restarted again on the partially-reduced file. A lot of time ends up being spent testing large chunks when these large chunks are very unlikely to actually pass the interestingness test. In cases like this, the tool will complete faster if the starting granularity is reduced to a finer amount. Thus, we introduce a command line flag that automatically divides the chunks into smaller subsets a fixed, user-specified number of times prior to beginning the core loop. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D112651	2021-11-09 10:14:08 -08:00
Fangrui Song	5f1e509579	[llvm-objdump] -p: Dump PE header for PE/COFF For a trivial DLL built with `clang --target=x86_64-windows -O2 -c a.c; lld-link -subsystem:console -dll a.o -out:a.dll`, `objdump -p` vs `llvm-objdump -p`: ``` -a.dll: file format pei-x86-64 - +a.dll: file format coff-x86-64 Characteristics 0x2022 executable large address aware @@ -57,4 +56,4 @@ Entry d 0000000000000000 00000000 Delay Import Directory Entry e 0000000000000000 00000000 CLR Runtime Header Entry f 0000000000000000 00000000 Reserved - +Export Table: ``` For a Linux image (`vmlinuz-5.10.76-gentoo-r1`) built with `CONFIG_EFI_STUB=y` ``` -vmlinuz-5.10.76-gentoo-r1: file format pei-x86-64 - -Characteristics 0x20e +vmlinuz-5.10.76-gentoo-r1: file format coff-x86-64 +Characteristics 0x206 executable line numbers stripped - symbols stripped debugging information removed Time/Date Wed Dec 31 16:00:00 1969 @@ -55,10 +53,4 @@ Entry d 0000000000000000 00000000 Delay Import Directory Entry e 0000000000000000 00000000 CLR Runtime Header Entry f 0000000000000000 00000000 Reserved - - -PE File Base Relocations (interpreted .reloc section contents) - -Virtual Address: 000037ca Chunk size 10 (0xa) Number of fixups 1 - reloc 0 offset 0 [37ca] ABSOLUTE - +Export Table: ``` `symbols stripped` looks like a GNU objdump problem. Reviewed By: jhenderson, alexander-shaposhnikov Differential Revision: https://reviews.llvm.org/D113356	2021-11-09 10:08:41 -08:00
Gulfem Savrun Yeniceri	ee88b8d63e	[compiler-rt] Add more diagnostic to InstrProfError If profile data is malformed for any kind of reason, we generate an error that only reports "malformed instrumentation profile data" without any further information. This patch extends InstrProfError class to receive an optional error message argument, so that we can do better error reporting. Differential Revision: https://reviews.llvm.org/D108942	2021-11-09 18:04:12 +00:00
Alexey Lapshin	c8ae08987d	[llvm-dwarfdump] dump link to the immediate parent. It is often useful to know which die is the parent of the current die. This patch adds information about parent offset into the dump: 0x0000000b: DW_TAG_compile_unit DW_AT_producer ("by_hand") 0x00000014: DW_TAG_base_type (0x0000000b) <<<<<<<<<<<<<< DW_AT_name ("int") Now it is easy to see which die is the parent of the current die. This patch makes that behaviour the default. We can make it opt-in if necessary. This functionality differs from the existing "--show-parents" in the sense that parent information is shown for all dies and only the link to the immediate parent is shown. Differential Revision: https://reviews.llvm.org/D113406	2021-11-09 14:14:06 +03:00
Simon Pilgrim	32a4a883f6	Revert rGe1eec7601b6988b35ae3cdc8d67cf3cf4e1361dd "[XCOFF][yaml2obj] support for the auxiliary file header." This is failing on MSVC builds: https://lab.llvm.org/buildbot/#/builders/86/builds/23436	2021-11-09 11:02:13 +00:00
Esme-Yi	e1eec7601b	[XCOFF][yaml2obj] support for the auxiliary file header. Summary: This patch adds yaml2obj supporting for the auxiliary file header of XCOFF. Reviewed By: DiggerLin, jhenderson Differential Revision: https://reviews.llvm.org/D111487	2021-11-09 09:48:40 +00:00
Paul Robinson	38be8f4057	Add llvm-tli-checker A new tool that compares TargetLibraryInfo's opinion of the availability of library function calls against the functions actually exported by a specified set of libraries. Can be helpful in verifying the correctness of TLI for a given target, and avoid mishaps such as had to be addressed in D107509 and `94b4598d`. The tool currently supports ELF object files only, although it's unlikely to be hard to add support for other formats. Re-commits `62dd488` with changes to use pre-generated objects, as not all bots have ld.lld available. Differential Revision: https://reviews.llvm.org/D111358	2021-11-08 16:29:28 -08:00
Paul Robinson	1297c21406	Revert "Add llvm-tli-checker" Not all bots have ld.lld available. This reverts commit `62dd488164`.	2021-11-08 15:48:29 -08:00
Paul Robinson	62dd488164	Add llvm-tli-checker A new tool that compares TargetLibraryInfo's opinion of the availability of library function calls against the functions actually exported by a specified set of libraries. Can be helpful in verifying the correctness of TLI for a given target, and avoid mishaps such as had to be addressed in D107509 and `94b4598d`. The tool currently supports ELF object files only, although it's unlikely to be hard to add support for other formats. Differential Revision: https://reviews.llvm.org/D111358	2021-11-08 14:59:13 -08:00
Adrian Prantl	8bd8dd16e2	Extend obj2yaml to optionally preserve raw __LINKEDIT/__DATA segments. I am planning to upstream MachOObjectFile code to support Darwin chained fixups. In order to test the new parser features we need a way to produce correct (and incorrect) chained fixups. Right now the only tool that can produce them is the Darwin linker. To avoid having to check in binary files, this patch allows obj2yaml to print a hexdump of the raw LINKEDIT and DATA segment, which both allows to bootstrap the parser and enables us to easily create malformed inputs to test error handling in the parser. This patch adds two new options to obj2yaml: -raw-data-segment -raw-linkedit-segment Differential Revision: https://reviews.llvm.org/D113234	2021-11-08 11:30:12 -08:00
Zarko Todorovski	c4396b77ae	[LLVM][llvm-cfi] Inclusive language: replace uses of blacklist with ignorelist Replace the description and file names for this argument. As far as I understand this is a positional argument and I don't believe this changes breaks any existing interfaces. Reviewed By: hctim, MaskRay Differential Revision: https://reviews.llvm.org/D113316	2021-11-08 10:05:52 -05:00
Esme-Yi	9b6f264d2b	[XCOFF][llvm-readobj] improve the relocation output. Summary: 1. implemented the unexpanded relocations output. 2. modified the expanded output format to align. Reviewed By: shchenz, jhenderson Differential Revision: https://reviews.llvm.org/D111700	2021-11-08 03:15:52 +00:00
David Blaikie	0a5c26f2ef	DebugInfo: Simplified Template Names: drop unneeded space in arrays Matching a recent clang change I've made, now 'int[3]' is formatted without the space between the type and array bound. This commit updates libDebugInfoDWARF/llvm-dwarfdump to match that formatting.	2021-11-05 22:50:57 -07:00
wlei	5bf191a381	[llvm-profgen] Fix index out of bounds error while using ip.advance Previously we assumed there were some non-executing sections at the bottom of the text section so that we wouldn't hit the array's bound. But on a BOLTed binary, it turned out the .bolt section is at the bottom of the text section and can be profiled, which crashes llvm-profgen. This change fixes it. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D113238	2021-11-05 18:38:40 -07:00
David Blaikie	f57d0e2726	DWARF Simplified Template Names: Narrow down the handling for operator overloads Actually we can, for now, remove the explicit "operator" handling entirely - since clang currently won't try to flag any of these as rebuildable. That seems like a reasonable state for now, but it could be narrowed down to only apply to conversion operators, most likely - but would need more nuance for op> and op>> since they would be incorrectly flagged as already having their template arguments (due to the trailing '>').	2021-11-05 15:41:56 -07:00
Fangrui Song	26a8ceba3e	[llvm-readobj] Display DT_RELRSZ/DT_RELRENT as " (bytes)" to match RELSZ/RELENT. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D113206	2021-11-05 10:02:49 -07:00
gbreynoo	ced9287c2d	[llvm-objdump] Fix the Assertion failure when providing invalid --debug-vars or --dwarf values As seen in https://bugs.llvm.org/show_bug.cgi?id=52213 llvm-objdump asserts if either the --debug-vars or the --dwarf options are provided with invalid values. As suggested, this fix adds use of a default value to these options and errors when given bad input. Differential Revision: https://reviews.llvm.org/D112183	2021-11-04 11:01:32 +00:00
wlei	138202a8c3	[llvm-profgen] Warn on invalid range and show warning summary Two things in this diff: 1) Warn on the invalid range, currently three types of checking, see the detailed message in the code. 2) In some situation, llvm-profgen gives lots of warnings on the truncated stacks which is noisy. This change provides a switch to `--show-detailed-warning` to skip the warnings. Alternatively, we use a summary for those warning and show the percentage of cases with those issues. Example of warning summary. ``` warning: 0.05%(1120/2428958) cases with issue: Profile context truncated due to missing probe for call instruction. warning: 0.00%(2/178637) cases with issue: Range does not belong to any functions, likely from external function. ``` Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D111902	2021-11-02 19:55:55 -07:00
Hongtao Yu	d0eb472f33	[llvm-profdata] Print out section flags for FunctionMetadata section As titled. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D113064	2021-11-02 17:59:22 -07:00
Arthur Eubanks	f54a8759f0	[llvm-reduce] Reduce more GlobalValue properties Reviewed By: hans Differential Revision: https://reviews.llvm.org/D112885	2021-11-02 08:47:41 -07:00
Arthur Eubanks	80ba72b07b	[llvm-reduce] Reduce some GlobalObject properties Specifically, the section and the alignment. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D112884	2021-11-02 08:47:32 -07:00
Frederic Cambus	650311737e	[llvm-readobj] Add support for reading OpenBSD ELF core notes. Notes generated in OpenBSD core files provide additional information about the kernel state and CPU registers. These notes are described in core.5, which can be viewed here: https://man.openbsd.org/core.5 Differential Revision: https://reviews.llvm.org/D111966	2021-11-02 10:18:54 +01:00
Markus Lavin	fd41738e2c	Recommit "[llvm-reduce] Add MIR support" (Second try. Need to link against CodeGen and MC libs.) The llvm-reduce tool has been extended to operate on MIR (import, clone and export). Current limitation is that only a single machine function is supported. A single reducer pass that operates on machine instructions (while on SSA-form) has been added. Additional MIR specific reducer passes can be added later as needed. Differential Revision: https://reviews.llvm.org/D110527	2021-11-02 10:16:42 +01:00
Markus Lavin	aee7f3384b	Revert "[llvm-reduce] Add MIR support" This reverts commit `bc2773cb1b`. Broke the clang-ppc64le-linux-multistage build. Reverting while I investigate.	2021-11-02 09:41:02 +01:00
Markus Lavin	bc2773cb1b	[llvm-reduce] Add MIR support The llvm-reduce tool has been extended to operate on MIR (import, clone and export). Current limitation is that only a single machine function is supported. A single reducer pass that operates on machine instructions (while on SSA-form) has been added. Additional MIR specific reducer passes can be added later as needed. Differential Revision: https://reviews.llvm.org/D110527	2021-11-02 09:14:56 +01:00
wlei	3f3103c6a9	[llvm-profgen] Fill zero count for all function ranges Allow filling zero count for all the function ranges even there is no samples hitting that function. Add a switch for this. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D112858	2021-11-01 09:57:05 -07:00
Esme-Yi	81441cf44c	[XCOFF] [llvm-readobj] replace tests using binary as input with tests generated by yaml2obj. Summary: Because yaml2obj supports basic transforming for XCOFF, some of the binary inputs used in the tests of llvm-readobj can be replaced with yaml files. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D111699	2021-11-01 08:43:32 +00:00
wlei	f5537643b8	[llvm-profgen] Update total samples by accumulating all its body samples Like probe-based profile, the total samples is the sum of all its body samples. This patch fix it by a post-processing update for the line-number based profile. Tested it on our internal services, results showed no performance change. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D112672	2021-10-29 10:36:57 -07:00
wlei	2f8196db92	[llvm-profgen] Fix bug of populating profile symbol list Previous implementation of populating profile symbol list is wrong, it only included the profiled symbols. Actually it should use all symbols, here this switches to use the symbols from debug info. Also turned the flag off by default. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D111824	2021-10-29 09:59:12 -07:00
Arthur Eubanks	177a703710	[llvm-reduce] Actually skip invalid candidates in operands-to-args This was checked while counting but not actually when doing the reduction, resulting in crashes. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D112766	2021-10-29 09:14:18 -07:00
David Blaikie	b65f24a74c	llvm-dwarfdump --verify: Don't diagnose functions in different sections as overlapping Functions in different sections (common in object files - inline functions, -ffunction-sections, etc) can't overlap, so factor in the section when diagnosing overlapping address ranges. This removes a major false-positive when running llvm-dwarfdump on unlinked code.	2021-10-28 17:13:57 -07:00
Hongtao Yu	259e4c5658	[CSSPGO] Trim cold base profiles for the CS preinliner. Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D112489	2021-10-27 22:50:27 -07:00
Djordje Todorovic	40c2bdf6d1	[llvm-locstats] Move the test from D110621 into test/llvm-locstats/ dir	2021-10-27 17:36:19 +02:00
djtodoro	30a3652b6a	[llvm-locstats] Report a warning if overflow was detected by llvm-dwarfdump Catch that llvm-dwarfdump detected an overflow in statistics. Differential Revision: https://reviews.llvm.org/D110621	2021-10-27 14:35:29 +02:00
Nico Weber	3c0cf7e1a9	Unbreak code_signature_lc.test on macOS after `911be05743`	2021-10-26 21:05:48 -04:00
Daniel Rodríguez Troitiño	911be05743	[test][objcopy] Replace GNU sed extension with BSD compatible syntax. GNU sed offers the `,+4d` to delete the line and the next four lines, but BSD sed doesn't seem to support it (at least in macOS 10.15, but seems to do in my 11.6 version). Replace the usage of the extension with the equivalent syntax that works both in BSD and GNU sed. I don't have a macOS 10.15 to check, but this works in both my macOS 11.6 and Linux machines. Differential Revision: https://reviews.llvm.org/D112583	2021-10-26 17:35:56 -07:00
David Blaikie	3ac709b6ce	llvm-dwarfdump --verify: Exit non-zero on simplified template name rebuilding failures	2021-10-26 15:57:16 -07:00
Nuri Amari	a299b24712	Regenerate LC_CODE_SIGNATURE during llvm-objcopy operations Context: This is a second attempt at introducing signature regeneration to llvm-objcopy. In this diff: https://reviews.llvm.org/D109840, a script was introduced to test the validity of a code signature. In this diff: https://reviews.llvm.org/D109803 (now reverted), an effort was made to extract the signature generation behavior out of LLD into a common location for use in llvm-objcopy. In this diff: https://reviews.llvm.org/D109972 it was decided that there was no appropriate common location and that a small amount of duplication to bring signature generation to llvm-objcopy would be better. This diff introduces this duplication. Summary Prior to this change, if a LC_CODE_SIGNATURE load command was included in the binary passed to llvm-objcopy, the command and associated section were simply copied and included verbatim in the new binary. If rest of the binary was modified at all, this results in an invalid Mach-O file. This change regenerates the signature rather than copying it. The code_signature_lc.test test was modified to include the yaml representation of a small signed MachO executable in order to effectively test the signature generation. Reviewed By: alexander-shaposhnikov, #lld-macho Differential Revision: https://reviews.llvm.org/D111164	2021-10-26 14:51:13 -07:00
zhijian	c2d2fb5093	address an test error on window os , exclude the test llvm/test/tools/llvm-readobj/XCOFF/xcoff-auxiliary-header.test from windows OS. http://45.33.8.238/win/47662/step_11.txt for https://reviews.llvm.org/D82549	2021-10-26 13:56:52 -04:00
zhijian	158083f0de	[AIX][XCOFF] parsing xcoff object file auxiliary header Summary: The patch supports parsing the xcoff object file auxiliary header with llvm-readobj with option "auxiliary-headers" the format of auxiliary header as https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/filesreference/XCOFF.html#XCOFF__fyovh386shar Reviewers: James Henderson, Jason Liu, Hubert Tong, Esme yi, Sean Fertile. Differential Revision: https://reviews.llvm.org/D82549	2021-10-26 10:40:25 -04:00
wlei	a5f411b7f8	[llvm-profgen] Allow unsymbolized profile as perf input This change allows the unsymbolized profile as input. The unsymbolized profile is created by `llvm-profgen` with `--skip-symbolization` and it's after the sample aggregation but before symbolization , so it has much small file size. It can be used for sample merging and trimming, also is useful for debugging or adding test cases. A switch `--unsymbolized-profile=file-patch` is added for this. Format of unsymbolized profile: ``` [context stack1] # If it's a CS profile number of entries in RangeCounter from_1-to_1:count_1 from_2-to_2:count_2 ...... from_n-to_n:count_n number of entries in BranchCounter src_1->dst_1:count_1 src_2->dst_2:count_2 ...... src_n->dst_n:count_n [context stack2] ...... ``` Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D111750	2021-10-25 23:58:08 -07:00
Jack Anderson	d7733f8422	[DebugInfo] Expand ability to load 2-byte addresses in dwarf sections Some dwarf loaders in LLVM are hard-coded to only accept 4-byte and 8-byte address sizes. This patch generalizes acceptance into `DWARFContext::isAddressSizeSupported` and provides a common way to generate rejection errors. The MSP430 target has been given new tests to cover dwarf loading cases that previously failed due to 2-byte addresses. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D111953	2021-10-21 17:31:00 -07:00
Wenlei He	e8c245dcd3	[llvm-profgen] Skip duplication factor outside of body sample computation We incorrectly use duplication factor for total samples even though we already accumulate samples instead of taking MAX. It causes profile to have bloated total samples for functions with loop unrolled or vectorized. The change fix the issue for total sample, head sample and call target samples. Differential Revision: https://reviews.llvm.org/D112042	2021-10-19 23:10:45 -07:00
Arthur Eubanks	9660563950	[llvm-reduce] Add reduction passes to reduce operands to undef/1/0 Having non-undef constants in a final llvm-reduce output is nicer than having undefs. This splits the existing reduce-operands pass into three, one which does the same as the current pass of reducing to undef, and two more to reduce to the constant 1 and the constant 0. Do not reduce to undef if the operand is a ConstantData, and do not reduce 0s to 1s. Reducing GEP operands very frequently causes invalid IR (since types may not match up if we index differently into a struct), so don't touch GEPs. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D111765	2021-10-19 15:25:21 -07:00
Simon Pilgrim	0bb32b1b21	[X86][SLM] Fix BitTest+Set uops + port usage Both ports are required for BitTest ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures and what Intel AoM / Agner reports as well.	2021-10-17 18:13:15 +01:00
Simon Pilgrim	5ed5df4802	[X86][SLM] Fix uops for PCMPISTR/PCMPISTR instructions Based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.	2021-10-17 18:13:14 +01:00
Simon Pilgrim	680afaaa5d	[X86][SLM] Fix uops for PCLMULQDQ Based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.	2021-10-17 18:13:14 +01:00
Simon Pilgrim	498c7236bc	[X86][SLM] +1uop for PSHUFBrm xmm Extra 1uop for folded pshufb ops, based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.	2021-10-17 18:13:14 +01:00
djtodoro	c450e47a8c	[llvm-dwarfdump] Fix unsigned overflow when calculating stats This fixes https://bugs.llvm.org/show_bug.cgi?id=51652. The idea is to bump all the stat fields to 64-bit wide unsigned integers. I've confirmed this resolves the use case for chromium. Differential Revision: https://reviews.llvm.org/D109217	2021-10-15 12:15:58 +02:00
Craig Topper	3ff9cc01f2	[X86] Use CMOVNS for abs instead of CMOVGE. CMOVGE reads SF and OF. CMOVNS only reads SF. This matches with other recent changes to use a single flag where possible. It also matches gcc codegen. I believe this technically changes whether the conditional move happens on INT_MIN, but for INT_MIN both registers are the same so it doesn't matter. Differential Revision: https://reviews.llvm.org/D111826	2021-10-14 12:28:28 -07:00
Kai Nacke	b050564d3e	[AIX] Ignore case when comparing output from od POSIX does not define the exact output from od tool. While most implementations use lower case characters in hex output, the z/OS USS implementation uses upper case characters. To avoid LIT failures, the FileCheck option to ignore the case must be used when checking hex bytes. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D111427	2021-10-14 13:51:02 -04:00
Wenlei He	a316343e19	[llvm-profgen] Allow generating AutoFDO profile from CSSPGO binary Add `-use-dwarf-correlation` switch to allow llvm-profgen to generate AutoFDO profile for binaries built with CSSPGO (pseudo-probe). Differential Revision: https://reviews.llvm.org/D111776	2021-10-14 09:11:56 -07:00
wlei	30ca33eab0	[llvm-profgen] Ignore the whole trace with the leading external branch The first LBR entry can be an external branch, we should ignore the whole trace. ``` 7f7448e889e4 0x7f7448e889e4/0x7f7448e88826/P/-/-/1 0x7f7448e8899f/0x7f7448e889d8/P/-/-/4 ... ``` Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D111749	2021-10-13 16:52:29 -07:00
Michael Kruse	dd71b65ca8	[llvm-reduce] Introduce operands-to-args pass. Instead of setting operands to undef as the "operands" pass does, convert the operands to a function argument. This avoids having to introduce undef values into the IR which have some unpredictability during optimizations. For instance, define void @func() { entry: %val = add i32 32, 21 store i32 %val, i32* null ret void } is reduced to define void @func(i32 %val) { entry: %val1 = add i32 32, 21 store i32 %val, i32* null ret void } (note that the instruction %val is renamed to %val1 when printing the IR to avoid ambiguity; ideally %val1 would be removed by dce or the instruction reduction pass) Any call to @func is replaced with a call to the function with the new signature and filled with undef. This is not ideal for IPA passes, but those out-of-scope for now. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D111503	2021-10-13 09:54:03 -05:00
Arthur Eubanks	337cf0a5ab	[llc] Support -time-trace in llc Mostly copied from opt.cpp. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D111466	2021-10-11 10:16:46 -07:00
Esme-Yi	a00ff71668	[XCOFF] Improve error message context. Summary: This patch improves the error message context of the XCOFF interfaces by providing more details. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D110320	2021-10-11 02:52:20 +00:00
David Green	adec922361	[AArch64] Make -mcpu=generic schedule for an in-order core We would like to start pushing -mcpu=generic towards enabling the set of features that improves performance for some CPUs, without hurting any others. A blend of the performance options hopefully beneficial to all CPUs. The largest part of that is enabling in-order scheduling using the Cortex-A55 schedule model. This is similar to the Arm backend change from `eecb353d0e` which made -mcpu=generic perform in-order scheduling using the cortex-a8 schedule model. The idea is that in-order cpu's require the most help in instruction scheduling, whereas out-of-order cpus can for the most part out-of-order schedule around different codegen. Our benchmarking suggests that hypothesis holds. When running on an in-order core this improved performance by 3.8% geomean on a set of DSP workloads, 2% geomean on some other embedded benchmark and between 1% and 1.8% on a set of singlecore and multicore workloads, all running on a Cortex-A55 cluster. On an out-of-order cpu the results are a lot more noisy but show flat performance or an improvement. On the set of DSP and embedded benchmarks, run on a Cortex-A78 there was a very noisy 1% speed improvement. Using the most detailed results I could find, SPEC2006 runs on a Neoverse N1 show a small increase in instruction count (+0.127%), but a decrease in cycle counts (-0.155%, on average). The instruction count is very low noise, the cycle count is more noisy with a 0.15% decrease not being significant. SPEC2k17 shows a small decrease (-0.2%) in instruction count leading to a -0.296% decrease in cycle count. These results are within noise margins but tend to show a small improvement in general. When specifying an Apple target, clang will set "-target-cpu apple-a7" on the command line, so should not be affected by this change when running from clang. This also doesn't enable more runtime unrolling like -mcpu=cortex-a55 does, only changing the schedule used. 
A lot of existing tests have been updated. This is a summary of the important differences: - Most changes are the same instructions in a different order. - Sometimes this leads to very minor inefficiencies, such as requiring an extra mov to move variables into r0/v0 for the return value of a test function. - misched-fusion.ll was no longer fusing the pairs of instructions it should, as per D110561. I've changed the schedule used in the test for now. - neon-mla-mls.ll now uses "mul; sub" as opposed to "neg; mla" due to the different latencies. This seems fine to me. - Some SVE tests do not always remove movprfx where they did before due to different register allocation giving different destructive forms. - The tests argument-blocks-array-of-struct.ll and arm64-windows-calls.ll produce two LDR where they previously produced an LDP due to store-pair-suppress kicking in. - arm64-ldp.ll and arm64-neon-copy.ll are missing pre/postinc on LDP. - Some tests such as arm64-neon-mul-div.ll and ragreedy-local-interval-cost.ll have more, less or just different spilling. - In aarch64_generated_funcs.ll.generated.expected one part of the function is no longer outlined. Interestingly if I switch this to use any other schedule even less is outlined. Some of these are expected to happen, such as differences in outlining or register spilling. There will be places where these result in worse codegen, places where they are better, with the SPEC instruction counts suggesting it is not a decrease overall, on average. Differential Revision: https://reviews.llvm.org/D110830	2021-10-09 15:58:31 +01:00
Qiu Chaofan	573531fb1f	Fix typo of colon to semicolon in lit tests	2021-10-09 10:03:50 +08:00
Abhina Sreeskantharajan	7d7b139042	[test] Use host platform specific error message substitution This patch modifies the testcase to use error substitution so it will pass on all platforms. Reviewed By: fanbo-meng, muiez Differential Revision: https://reviews.llvm.org/D111320	2021-10-08 13:52:31 -04:00
wlei	b1a45c62f0	[llvm-profgen] Ignore branch count against outline function For some transformations like hot-cold split or coro split, it can outline its part of function ranges. Since sample loader is the early stage of backend and no split happens at that time, compiler can't recognize those functions, so in llvm-profgen we should attribute the sample to the original function. This is already done for the body range samples since we use the symbols from dwarf which is created before the split. But for branch samples, the call from master function to its outlined function is actually not a call to the original function, we shouldn't add head/callsite samples for it. So instead of dwarf symbol, we use the symbols from symbol table and ignore those functions with special suffixes(like `.cold` ,`.resume`) for accumulating the callsite/head samples. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110864	2021-10-07 14:03:34 -07:00
gbreynoo	9072183cb6	[llvm-objdump] Fix --prefix and --prefix-strip In the command guide --prefix and --prefix-strip is used in the form --prefix=<prefix> however currently it is used in the form --prefix <prefix>. This change fixes these options to match the command guide. Differential Revision: https://reviews.llvm.org/D110551	2021-10-07 15:53:45 +01:00
wlei	16516f8925	[llvm-profgen] Support symbol list for accurate profile Differential Revision: https://reviews.llvm.org/D110859	2021-10-06 11:41:39 -07:00
Petr Hosek	24c615fa6b	[InstrProfData] Bump the raw profile version to 8 This is to account for the change that made CountersPtr in __profd_ relative which landed in `a1532ed275`. That change hasn't updated the raw profile version, and while the profile layout stayed the same, profiles generated by tip-of-tree LLVM are incompatible with 13.x tooling. Differential Revision: https://reviews.llvm.org/D111123	2021-10-05 09:57:56 -07:00
gbhyamso	02895eede1	[llvm-cxxfilt][NFC] Fix test for running in Windows cmd The test llvm\test\tools\llvm-cxxfilt\delimiters.test started failing when run from cmd.exe on Windows after D110986 which added a unicode character (⦙) to it. Piping the unicode character in cmd.exe causes it to be converted to a '?'. That causes the test to fail because the llvm-cxxfilt output becomes Foo?Bar rather than the expected Foo⦙Bar. Redirect the echo output to and from a temporary file to get around this problem. It's not entirely clear what the root cause is, but two separate downstream builders are tripping up on this, so we are landing the work around for the time being. Differential Revision: https://reviews.llvm.org/D111072	2021-10-05 12:10:06 +01:00
wlei	31a5cb3292	[llvm-profgen] Filter out invalid debug line Differential Revision: https://reviews.llvm.org/D110081	2021-10-04 19:09:06 -07:00
wlei	46cf7d75d9	[llvm-profgen] Add duplication factor for line-number based profile This change adds duplication factor multiplier while accumulating body samples for line-number based profile. The body sample count will be `duplication-factor * count`. Base discriminator and duplication factor is decoded from the raw discriminator, this requires some refactor works. Differential Revision: https://reviews.llvm.org/D109934	2021-10-04 19:08:55 -07:00
Simon Pilgrim	7cae0daee6	[X86][Atom] Fix BSR/BSF uops + port usage Both ports are required for BitScan ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner reports as well.	2021-10-02 19:09:44 +01:00
Simon Pilgrim	8e7f6039fa	[X86] Atom SSE shift-by-variable take 2uops/3uops not 1uop Based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well.	2021-10-02 12:28:41 +01:00
Tomasz Miąsko	f33274c7bf	[llvm-cxxfilt] Replace isalnum with isAlnum from StringExtras D104366 introduced a new llvm-cxxfilt test with non-ASCII characters, which caused a failure on llvm-clang-x86_64-expensive-checks-win builder, with a stack trace suggesting issue in a call to isalnum. The argument to isalnum should be either EOF or a value that is representable in the type unsigned char. The llvm-cxxfilt does not perform a cast from char to unsigned char before the call, so the value might be out of valid range. Replace the call to isalnum with isAlnum from StringExtras, which takes a char as the argument. This also makes the check independent of the current locale. Differential Revision: https://reviews.llvm.org/D110986	2021-10-02 08:54:04 +02:00
zhijian	5b44c716ee	[AIX]implement the --syms and using "symbol index and qualname" for --sym --symbol--description for llvm-objdump for xcoff Summary: for xcoff : implement the getSymbolFlag and getSymbolType() for option --syms. llvm-objdump --sym , if the symbol is label, print the containing section for the symbol too. when using llvm-objdump --sym --symbol--description, print the symbol index and qualname for symbol. for example: --symbol-description 00000000000000c0 l .text (csect: (idx: 2) .foov[PR]) (idx: 3) .foov and without --symbol-description 00000000000000c0 l .text (csect: .foov) .foov Reviewers: James Henderson,Esme Yi Differential Revision: https://reviews.llvm.org/D109452	2021-10-01 12:37:51 -04:00
Florian Hahn	57fbb9ed0e	[llvm-reduce] Skip updating calls where OldF isn't the called fn. When replacing function calls, skip call instructions where the old function is not the called function, but e.g. the old function is passed as an argument. This fixes a crash due to trying to construct invalid IR for the test case. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D109759	2021-10-01 10:52:48 +01:00
Fangrui Song	8971b99c83	[llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling convention or the vector calling convention. See https://github.com/riscv/riscv-elf-psabi-doc/pull/190 Differential Revision: https://reviews.llvm.org/D107949	2021-09-29 16:56:52 -07:00
Wael Yehia	8b8da01d88	Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace." This reverts commit `a60405cf03`.	2021-09-29 19:43:35 +00:00
Michael Kruse	d9562a8e45	[llvm-reduce] Reduce metadata references. The ReduceMetadata pass before this patch removed metadata on a per-MDNode (or NamedMDNode) basis. Either all references to an MDNode are kept, or all of them are removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNodes. As a consequence, e.g. tbaa references to the same type will all have the same MDNode reference and hence make it impossible to reduce only keeping metadata on those memory accesses for which they are interesting. Moreover, MDNodes can also be referenced by some intrinsics or other MDNodes. These references were not considered for removal leading to the possibility that MDNodes are not actually removed even if selected to be removed by the oracle. This patch changes ReduceMetadata to reduce based on removable metadata references instead. MDNodes without references are implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed as it would violate the immutability of MDNodes. Additionally, ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects a MDKind (such as `MD_tbaa`) as first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind). Reviewed By: aeubanks, swamulism Differential Revision: https://reviews.llvm.org/D110534	2021-09-29 11:25:35 -05:00
David Green	e9adcbde31	[AArch64] Model Cortex-A55 Q register NEON instructions Cortex-A55 has 2 64bit NEON vector units, meaning a 128bit instruction requires taking both units (and can only be issued as the first instruction in a dual issue pair). This patch models that by splitting the WriteV SchedWrite into two - the WriteVd that reads/writes only 64bit operands, and the WriteVq that read/writes 128bit registers. The A55 schedule then uses this distinction to model the WriteVq as taking both resource units, and starting a Schedule Group and WriteVd as taking one as before. I believe this is more correct, even if it does not lead to much better performance. Differential Revision: https://reviews.llvm.org/D108766	2021-09-29 16:55:31 +01:00
Wael Yehia	a60405cf03	[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace. Reviewed by: steven_wu, fhahn, tejohnson Differential Revision: https://reviews.llvm.org/D110075	2021-09-29 12:17:53 +00:00
Igor Kudrin	7b424b9333	[llvm-objcopy] Rename relocation sections together with their targets. As for now, llvm-objcopy renames only sections that are specified explicitly in --rename-section, while GNU objcopy keeps names of relocation sections in sync with their targets. For example: > readelf -S test.o ... [ 1] .foo PROGBITS [ 2] .rela.foo RELA > objcopy --rename-section .foo=.bar test.o gnu.o > readelf -S gnu.o ... [ 1] .bar PROGBITS [ 2] .rela.bar RELA > llvm-objcopy --rename-section .foo=.bar test.o llvm.o > readelf -S llvm.o ... [ 1] .bar PROGBITS [ 2] .rela.foo RELA This patch makes llvm-objcopy match the behavior of GNU objcopy better. Differential Revision: https://reviews.llvm.org/D110352	2021-09-29 16:36:37 +07:00
wlei	a03cf331e1	[llvm-profgen] Strip context to support non-CS profile generation for hybrid sample Differential Revision: https://reviews.llvm.org/D109769	2021-09-28 12:20:23 -07:00
Leonard Chan	b9f547e8e5	[llvm][profile] Add padding after binary IDs Some tests with binary IDs would fail with error: no profile can be merged. This is because raw profiles could have unaligned headers when emitting binary IDs. This means padding should be emitted after binary IDs are emitted to ensure everything else is aligned. This patch adds padding after each binary ID to ensure the next binary ID size is 8-byte aligned. This also adds extra checks to ensure we aren't reading corrupted data when printing binary IDs. Differential Revision: https://reviews.llvm.org/D110365	2021-09-28 11:50:50 -07:00
Fangrui Song	74a47e54be	[llvm-objdump] Fix -R display and support ET_EXEC * Add a newline before `DYNAMIC RELOCATION RECORDS` (see D101796) * Add the missing `OFFSET TYPE VALUE` line * Align columns Note: llvm-readobj/ELFDumper.cpp `loadDynamicTable` has sophisticated PT_DYNAMIC code which is unavailable in llvm-objdump. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D110595	2021-09-28 09:58:27 -07:00
Alex Richardson	547e5e4ae6	[update_llc_test_checks.py] Fix MIPS ASM regex for functions with EH On MIPS, functions with exception handling code emit an additional temporary label at the start of the function (due to UseAssignmentForEHBegin): _Z8do_catchv: # @_Z8do_catchv .Ltmp3: .set .Lfunc_begin0, .Ltmp3 .cfi_startproc .cfi_personality 128, DW.ref.__gxx_personality_v0 .cfi_lsda 0, .Lexception0 .frame $c11,48,$c17 .mask 0x00000000,0 .fmask 0x00000000,0 .set noreorder .set nomacro .set noat # %bb.0: # %entry The `[^:]*` regex was terminating the search after .Ltmp<N>: and therefore not detecting functions with exception handling. Reviewed By: atanasyan, MaskRay Differential Revision: https://reviews.llvm.org/D100027	2021-09-28 17:57:36 +01:00
Alex Richardson	ee3109b044	[update_llc_test_checks] Baseline test for D100027 Show that we fail to generate CHECK lines for MIPS64 functions with EH. Differential Revision: https://reviews.llvm.org/D110408	2021-09-28 17:57:36 +01:00
Jozef Lawrynowicz	6cfb4d46ba	[llvm-readobj] Support dumping of MSP430 ELF attributes The MSP430 ABI supports build attributes for specifying the ISA, code model, data model and enum size in ELF object files. Differential Revision: https://reviews.llvm.org/D107969	2021-09-28 00:56:11 +03:00
modimo	ce6ed64a69	[llvm-profdata] Extend support of --topn to sample profiles Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110449	2021-09-24 16:42:46 -07:00
Wei Mi	80865f7579	Add "REQUIRES: zlib" in forward-compatible.test since it handles compressed file.	2021-09-24 15:35:07 -07:00
Wei Mi	e8b376547b	Fixed a bug in https://reviews.llvm.org/rG8eb617d719bdc6a4ed7773925d2421b9bbdd4b7a . For compressed profile when reading an unknown section, the data reader pointer adjustment was incorrect. This patch fixed that.	2021-09-24 15:23:45 -07:00
Jonas Devlieghere	d0649320bf	[dsymutil] Update union-fwd-decl.test for Windows Remove path separators from CHECK-lines in union-fwd-decl.test	2021-09-24 15:07:22 -07:00
David Blaikie	9911af4b91	WIP: Verify -gsimple-template-names=mangled values Clang will encode names that should be able to be simplified as "_STNname\|<template, args>" (eg: "_STNt1\|<int>") - this verification mode will detect these names, decode them, create the original name ("t1<int>") and the simple name ("t1") - letting the simple name run through the usual rebuilding logic - then compare the two sources of the full name - the rebuilt and the _STN encoding. This helps ensure that -gsimple-template-names is lossless.	2021-09-24 14:28:18 -07:00
Jonas Devlieghere	62d6ff5e9e	[dsymutil] Track incompleteness across unions When determining the incompleteness of a DIE based on its children, make sure we propagate it across union types. See test case for an example. Without this patch we never emit the definition of Container_ivars. Differential revision: https://reviews.llvm.org/D110443	2021-09-24 14:26:37 -07:00
wlei	1422fa5fab	[llvm-profgen] Unify output format of different unsymbolized profiles Differential Revision: https://reviews.llvm.org/D110080	2021-09-24 14:18:00 -07:00
wlei	28277e9b48	[AutoFDO][llvm-profgen] Report zero count for unexecuted part of function code In order to be consistent with compiler that interprets zero count as unexecuted(cold), this change reports zero-value count for unexecuted part of function code. For the implementation, it leverages the range counter, initializes all the executed function range with the zero-value. After all ranges are merged and converted into disjoint ranges, the remaining zero count will indicate the unexecuted(cold) part of the function. This change also extends the current `findDisjointRanges` method which now can support adding zero-value range. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D109713	2021-09-24 14:15:05 -07:00
wlei	d5f2013004	[AutoFDO][llvm-profgen] Profile generation for LBR(non-CS) sample This patch introduces non-CS AutoFDO profile generation into LLVM. The profile is supposed to be well consumed by compiler using `-fprofile-sample-use=[profile]`. After range and branch counters are extracted from the LBR sample, here we go through each address for symbolization, create FunctionSamples and populate its sub fields like TotalSamples, BodySamples and HeadSamples etc. For inlined code, as we need to map back to original code, we always add body samples to the leaf frame's function sample. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D109551	2021-09-24 13:55:34 -07:00
wlei	a7cdcf25c1	[llvm-profgen] Ignore invalid perf line in LBR record Similar to https://reviews.llvm.org/D109637, there is a whole invalid line of message in perfscript. ``` warning: Invalid address in LBR record at line 14118674: Processed 14138923 events and lost 1 chunks! warning: Invalid address in LBR record at line 14118676: Check IO/CPU overload! ``` This only happened for LBR only perfscript, hybridperfscript have a check of " 0x" to make sure it's the LBR perf line. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110424	2021-09-24 13:44:57 -07:00
Simon Pilgrim	dade83c02a	[X86][SLM] Fix ADDQ/SUBQ/CMPEQQ throughput to account for running on either port. Testing on a SLM box suggests these can run on either port, but the throughput is 4cy on either (inc MMX versions). Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-24 10:06:14 +01:00
Wenlei He	81c249784f	[llvm-profgen] Use hot threshold for context merging and trimming Without preinliner, we need to tune down the cold count cutoff to merge/trim more context to limit profile size for large components. However it doesn't make sense for cold threshold to be higher than hot threshold, so we now change to use hot threshold as merging/trimming cut off instead. Differential Revision: https://reviews.llvm.org/D110212	2021-09-22 15:01:51 -07:00
Hongtao Yu	734f4d832c	[llvm-profgen] An option to dump disasm of specified symbols For large app, dumping disasm of the whole program can be slow and result in giant output. Adding a switch to dump specific symbols only. Reviewed By: wlei Differential Revision: https://reviews.llvm.org/D110079	2021-09-22 10:32:59 -07:00
Hongtao Yu	d9b511d8e8	[CSSPGO] Set PseudoProbeInserter as a default pass. Currently PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e, through the linker or llc), where user has always to pass -pseudo-probe-for-profiling explicitly. I'm making the pass a default pass that requires no command line arg to trigger, but will be actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110209	2021-09-22 09:09:48 -07:00
Sebastian Neubauer	ecd5145c27	[Utils] Replace llc with cat for tests Make the update_llc_test_checks script test independent of llc behavior by using cat with static files to simulate llc output. This allows changing llc without breaking the script test case. The update script is executed in a temporary directory, so the llc-generated assembly files are copied there. %T is deprecated, but it allows copying a file with a predictable filename. Differential Revision: https://reviews.llvm.org/D110143	2021-09-22 10:10:35 +02:00
David Blaikie	49c519a848	DebugInfo: Rebuild decltype(nullptr) as 'std::nullptr_t' Now that Clang's been changed to render nullptr types/template parameters as 'std::nullptr_t' do the same thing down here. (Clang commit: `131e878664` )	2021-09-21 11:37:30 -07:00
Paul Robinson	fa822a2ee5	[DebugInfo] Add test for dumping DW_AT_defaulted	2021-09-20 16:43:53 -04:00
Alex Richardson	817e23d481	[update_mir_test_checks.py] Use -NEXT FileCheck directives Previously the script emitted output using plain CHECK directives. This can result in a test passing even if there are some instructions between CHECK directives that should have been removed. It also makes debugging tests that have the output in a different order more difficult since FileCheck can match with a later line and then complain about the "wrong" directive not being found. This will cause quite large diffs when updating existing tests, but I'm not sure we need an opt-in flag here. Depends on D109765 (pre-commit tests) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D109767	2021-09-20 12:55:56 +01:00
Alex Richardson	7b68c0725d	pre-commit test for D109767 Differential Revision: https://reviews.llvm.org/D109765	2021-09-20 12:55:56 +01:00
David Blaikie	cb42bb3550	llvm-dwarfdump: pretty type printing: print fully qualified names in function type parameter types	2021-09-19 18:49:15 -07:00
David Blaikie	606ea0dd2a	llvm-dwarfdump: support for type printing "decltype(nullptr)" as "nullptr_t" This should probably be rendered as "std::nullptr_t" but for now clang uses the unqualified name (which is ambiguous with possible user defined name in the global namespace), so match that here.	2021-09-19 17:33:56 -07:00
David Blaikie	11e0b79b05	llvm-dwarfdump: Don't print even an empty string when a type is unprintable	2021-09-19 17:03:10 -07:00
David Blaikie	5bfe5207ef	llvm-dwarfdump: Pretty print names qualified/with scopes	2021-09-19 16:36:01 -07:00
David Blaikie	372e2c24b6	llvm-dwarfdump: Pretty printing types including a space between const and parenthesized references/pointers to arrays	2021-09-19 13:32:53 -07:00
David Blaikie	f09ca5c646	DWARFDie: Improve type printing for function and array types - with qualifiers (cv/reference) and pointers to them	2021-09-19 12:59:31 -07:00
Simon Pilgrim	f855ef2601	[X86][Atom] Fix FP uops + port usage Both ports are required in most cases. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well. Noticed while trying to improve fp costs for vectorization via the D103695 helper script.	2021-09-19 20:39:20 +01:00
David Blaikie	2ca637c976	llvm-dwarfdump: Refactor type pretty printing tests Move most type tests to a pre-generated assembly file to make it easier to add more weird cases without having to hand craft more DWARF. Move the novel array types that aren't reachable via clang-generated DWARF to a separate file for easy maintenance.	2021-09-19 09:30:38 -07:00
Simon Pilgrim	cf8fac7d07	[X86][Atom] Specific uops for all IMUL/IDIV instructions Based off a mixture of llvm-exegesis captures (PR36895) and Intel AoM / Agner / InstLatX64 reports.	2021-09-19 16:58:52 +01:00
Simon Pilgrim	e381d8b243	[X86][Atom] Fix (U)COMISS/SD uops, latency and throughput Both ports are required, for reg and mem variants - we can also use the WriteFComX class directly and remove the unnecessary InstRW overrides. Matches what Intel AoM / Agner / InstLatX64 report as well.	2021-09-19 12:44:44 +01:00
Samuel	f18c0739b3	[llvm-reduce] Add reduce operands pass Add reduction to set operands to default values Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D108903	2021-09-17 12:32:15 -07:00
Simon Pilgrim	5ebe95e256	[X86][Atom] Fix integer shuffles uops, latency and throughput The MMX pack/unpck shuffles don't need an override - they have the same behaviour as other shuffles (Port0 only). The SSE pslldq/psrldq shuffles don't need an override - they have the same behaviour as other shuffles (Port0 only). The SSE pshufb shuffles use 4uops (+1 load). Noticed the pslldq/psrldq issue while trying to improve reduction costs via the D103695 helper script, and fixed the others while reviewing. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-17 12:11:54 +01:00
Wenlei He	446e21623c	[llvm-profgen] Use context-sensitive byte size cost for preinliner decisions by default Turn on `use-context-cost-for-preinliner` to use context-sensitive byte size cost for preinliner decisions by default. This is a more accurate proxy of inline cost than profile size. We tested on our large workload that it delivers measurable CPU improvement. Differential Revision: https://reviews.llvm.org/D109893	2021-09-16 10:36:12 -07:00
serge-sans-paille	85f2ae57f7	Be more flexible on the storage type allowed for llvm::Any::TypeId::Id This is a follow-up to `2c42a73d6c`.	2021-09-16 11:01:53 +02:00
Arthur Eubanks	5d78e33ce5	[test] Move some llvm-extract tests into the proper directory	2021-09-15 15:42:04 -07:00
serge-sans-paille	2c42a73d6c	Add extra check for llvm::Any::TypeId visibility This check should ensure we don't reproduce the problem fixed by `02df443d28` More accurately, it checks every llvm::Any::TypeId symbol in libLLVM-x.so and makes sure they have weak linkage and are not local to the library, which would lead to duplicate definition if another weak version of the symbol is defined in another linked library. Differential Revision: https://reviews.llvm.org/D109252	2021-09-15 08:32:55 +02:00
Esme-Yi	945df8bc4c	[obj2yaml][XCOFF] Dump sections Summary: This patch implements parsing sections for obj2yaml on AIX. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D98003	2021-09-15 05:16:33 +00:00
Hongtao Yu	0057c7185d	[CSSPGO][llvm-profgen] Truncate stack samples with invalid return address. Invalid frame addresses exist in call stack samples due to bad unwinding. This could happen to frame-pointer-based unwinding and the callee functions that do not have the frame pointer chain set up. It isn't common when the program is built with the frame pointer omission disabled, but can still happen with third-party static libs built with frame pointer omitted. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D109638	2021-09-14 21:56:22 -07:00
Martin Storsjö	63784b9a75	[llvm-readobj] [COFF] Resolve relocations pointing at section symbols for arm64 too This syncs parts from the x86 implementation to the ARMWinEH implementation. Currently, neither of the compilers targeting COFF/arm64 (MSVC, LLVM) produce such relocations, but LLVM might after a later patch. Differential Revision: https://reviews.llvm.org/D109650	2021-09-14 11:04:46 +03:00
Martin Storsjö	197084fcee	[llvm-readobj] [COFF] Try to resolve symbols in unwind info on x86 This is the same as we do on arm64 already for the MSVC style label symbols, but also handle the way GCC produces it - with all relocations pointing at the .text section symbol, with various offsets. Differential Revision: https://reviews.llvm.org/D109649	2021-09-14 11:04:46 +03:00
Esme-Yi	b98c3e957f	[yaml2obj][XCOFF] add the SectionIndex field for symbol. Summary: Add the SectionIndex field for symbol. 1: a symbol can reference a section by SectionName or SectionIndex. 2: a symbol can reference a section by both SectionName and SectionIndex. 3: if both Section and SectionIndex are specified, but the two values refer to different sections, an error will be reported. 4: an invalid SectionIndex is allowed. 5: if a symbol references a non-existent section by SectionName, an error will be reported. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D109566	2021-09-14 06:18:03 +00:00
Esme-Yi	909f3d7380	[yaml2obj][XCOFF] customize the string table Summary: The patch adds support for yaml2obj customizing the string table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D107421	2021-09-13 09:24:38 +00:00
Simon Pilgrim	65ad09da0e	[X86][SLM] Fix DIVPD/DIVPS/RCPPS/RSQRTPS/SQRTPD/SQRTPS/DPPD/DPPS uops, latency and throughput The packed variants of the instructions had been modelled as the same as the scalar variants. Reported during a run of llvm-exegesis on a cheap SLM box and matches what Agner / InstLatX64 report as well.	2021-09-13 08:36:43 +01:00
Simon Pilgrim	df975e4590	[X86][SLM] Fix PSAD/MPSAD uops, latency and throughput Noticed while trying to improve generic reduction costs via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-11 11:44:09 +01:00
Simon Pilgrim	484944ac3b	[X86][SLM] Fix HADD/HSUB uops, latency and throughput Noticed while trying to improve generic reduction costs via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-11 11:44:09 +01:00
Keith Smiley	e972e49b11	[llvm-cov] Add error for invalid -path-equivalence format Differential Revision: https://reviews.llvm.org/D109042	2021-09-10 18:34:37 -07:00
Sam Clegg	e4b2f3054a	[WebAssembly][libObject] Avoid re-use of Section object during parsing The re-use of this struct across iterations of the loop was causing fields (specifically Name) to be incorrectly shared between multiple sections. Differential Revision: https://reviews.llvm.org/D108984	2021-09-10 09:30:50 -04:00
Serge Bazanski	231bfaab31	[Lanai] fix MC / objdump D78776 removed is{Call,Branch,UnconditionalBranch} guards in objdump before calling MCInstrAnalysis::evaluateBranch. This is fine for other architectures as they gracefully handle evaluateBranch being called on non-branches. However, the Lanai MCInstrAnalysis implementation didn't and that change caused it to crash. This inserts the same guards back into Lanai's evaluateBranch implementation and adds a smoke test that exercises `llc \| objdump` so this kind of regression is hopefully caught next time. Reviewed By: jpienaar, MaskRay Differential Revision: https://reviews.llvm.org/D107593	2021-09-10 10:46:13 +00:00
Alfonso Sánchez-Beato	b25ab4f313	[llvm-objcopy][COFF] Fix test for debug dir presence If the number of directories was 6 (equal to the DEBUG_DIRECTORY index), patchDebugDirectory() was run even though the debug directory is actually the 7th entry. Use <= in the comparison to fix that. This fixes https://llvm.org/PR51243 Differential Revision: https://reviews.llvm.org/D106940 Reviewed by: jhenderson	2021-09-10 09:57:18 +01:00
Alfonso Sánchez-Beato	b33fd31772	[yaml2obj][COFF] Allow variable number of directories Allow variable number of directories, as allowed by the specification. NumberOfRvaAndSize will default to 16 if not specified, as in the past. Reviewed by: jhenderson Differential Revision: https://reviews.llvm.org/D108825	2021-09-09 11:16:56 +01:00
Wei Mi	8eb617d719	[SampleFDO] Allow forward compatibility when adding a new section for extbinary format. Currently when we add a new section in the profile format and generate a profile containing the new section, older compiler which reads the new profile will issue an error. The forward incompatibility can cause unnecessary churn when extending the profile. This patch removes the incompatibility when adding a new section for extbinary format. Differential Revision: https://reviews.llvm.org/D109398	2021-09-07 19:38:43 -07:00
Maksim Panchenko	6300e4ac58	[llvm-objdump] Fix 'llvm-objdump -dr' for executables with relocations Print relocations interleaved with disassembled instructions for executables with relocatable sections, e.g. those built with "-Wl,-q". Differential Revision: https://reviews.llvm.org/D109016	2021-09-07 11:24:24 -07:00
Roman Lebedev	e030f808ec	[Exegesis] Native clusterization: sub-partition by sched class id Currently native clusterization simply groups all benchmarks by the opcode of key instruction, but that is suboptimal in certain cases, e.g. where we can already tell that the particular instructions already resolve into different sched classes.	2021-09-07 17:54:37 +03:00
Roman Lebedev	b3b9b297a0	[NFC][exegesis] Add test for the following patch	2021-09-07 17:54:36 +03:00
Simon Pilgrim	056b409ceb	[llvm-exegesis][x86] Limit llvm-exegesis analysis tests to x86_64 triple hosts Attempting to fix an issue with test failures on arm m1 apple macintoshes reported on D109353	2021-09-07 14:35:52 +01:00
Simon Pilgrim	6a9e2764f6	[llvm-exegesis] Analysis tests should run even without libpfm (PR51687) Move inverse_throughput, latency and uops to sub-directories (like we already do for lbr), which require libpfm, so we can relax the lit limits for analysis tests in the x86 root directory. Differential Revision: https://reviews.llvm.org/D109353	2021-09-07 13:58:05 +01:00
Andrew Litteken	bd4b1b5f6d	[IRSim] Adding support for recognizing branch similarity The current IRSimilarityIdentifier does not try to find similarity across blocks, this patch provides a mechanism to compare two branches against one another, to find similarity across basic blocks, rather than just within them. This adds a step in the similarity identification process that labels all of the basic blocks so that we can identify the relative branching locations. Within an IRSimilarityCandidate we use these relative locations to determine whether the branching to other relative locations in the same region is the same between branches. If they are, we consider them similar. We do not consider the relative location of the branch if the target branch is outside of the region. In this case, both branches must exit to a location outside the region, but the exact relative location does not matter. Reviewers: paquette, yroux Differential Revision: https://reviews.llvm.org/D106989	2021-09-06 11:55:38 -07:00
Simon Pilgrim	2005ae15a6	[X86][SLM] WriteVecIMul instructions only take 1uop (REAPPLIED) The xmm variants have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop. I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD. But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.	2021-09-04 15:03:56 +01:00
Simon Pilgrim	ac51d69208	Revert rG994da657076900f5ad7fe593c3b5e5f89ab3d53d "[X86][SLM] WriteVecIMul instructions only take 1uop" This changed some codegen tests that I forgot about in my rebase, I'll recommit shortly with a fix.	2021-09-04 13:39:10 +01:00
Simon Pilgrim	994da65707	[X86][SLM] WriteVecIMul instructions only take 1uop The xmm variants have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop. I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD. But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.	2021-09-04 13:21:34 +01:00
Simon Pilgrim	c6371020a8	[X86][SLM] RMW instructions don't require an extra uop For RMW instructions, the load and store hold the MEC for an extra cycle, but within the same single uop. This is alluded to in the Intel AOM: "The MEC also owns the MEC RSV, which is responsible for scheduling of all loads and stores. Load and store instructions go through addresses generation phase in program order to avoid on-the-fly memory ordering later in the pipeline. Therefore, an unknown address will stall younger memory instructions." Noticed while trying to get a cheap SLM test box up and running with llvm-exegesis - RMW arithmetic is always 1uop - and matches what Agner / InstLatX64 report as well.	2021-09-04 13:21:34 +01:00
Simon Pilgrim	da965a77d5	[X86][SLM] Fix MUL uops, latency and throughput These were all set to the same best case mul i32 values (which seems to be the only version of MUL that SLM actually performs well with). Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.	2021-09-04 13:21:34 +01:00
Simon Pilgrim	7d062d2c47	[X86][Atom] MUL/DIV instructions require both ports, not either. Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM.	2021-09-04 11:58:09 +01:00

5669 Commits