llvm-project

Commit Graph

Author	SHA1	Message	Date
Adrian Prantl	072f5180f2	Improve error handling in llvm-dwarfdump. Without this patch we're only showing a generic error message derived from the error code to the end user. rdar://79378794 Differential Revision: https://reviews.llvm.org/D104483	2021-06-23 10:44:13 -07:00
Fangrui Song	011b502ce8	[llvm-objcopy][MachO] Fix namespace style issues	2021-06-23 00:31:52 -07:00
Hongtao Yu	5c8659801a	[CSSPGO][llvm-profgen] Handle return to external transition. In a callback case, a return from internal code, say A, to external runtime can happen. The external runtime can then call back to another internal routine, say B. Making an artificial branch that looks like a return from A to B can confuse the unwinder to treat the instruction before B as the call instruction. Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D104546	2021-06-22 16:24:59 -07:00
Bill Wendling	46db43240f	[llvm-diff] Explicitly check ConstantArrays Global initializers may be ConstantArrays. They need to be checked explicitly, because different-yet-still-equivalent type names may be used for each, and/or a GEP instruction may appear in one.	2021-06-22 12:23:38 -07:00
Bill Wendling	ab6002871d	[llvm-diff] Add support for diffing the callbr instruction The only wrinkle is that we can't process the "blockaddress" arguments of the callbr until the blocks have been equated. So we force them to be "unified" before checking. This was left out when the callbr instruction was added. Differential Revision: https://reviews.llvm.org/D104606	2021-06-22 12:23:37 -07:00
Patrick Holland	d03736455c	[MCA] [In-order pipeline] Fix for 0 latency instruction causing assertion to fail. 0 latency instructions now get processed and retired properly within the in-order pipeline. Had to fix a bug within TimelineView.cpp as well that would show up when a 0 latency instruction was the first instruction in the source. Differential Revision: https://reviews.llvm.org/D104675	2021-06-22 10:18:39 -07:00
Fangrui Song	3accff2553	[llvm-objcopy] Fix some namespace style issues https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D104693	2021-06-22 09:19:48 -07:00
Bill Wendling	dd1b121c99	[llvm-diff] Constify APIs so that there aren't conflicts Some APIs work with const variables while others don't. This can cause conflicts when calling one from the other. This is NFC. Differential Revision: https://reviews.llvm.org/D104719	2021-06-22 09:17:04 -07:00
Martin Storsjö	703b0ed8e2	[ADT] Add StringRef consume_front_lower and consume_back_lower These serve as a convenient combination of consume_front/back and startswith_lower/endswith_lower, consistent with other existing case insensitive methods named <operation>_lower. Differential Revision: https://reviews.llvm.org/D104218	2021-06-22 12:38:08 +03:00
Fangrui Song	3f873e9b51	[llvm-objcopy] Internalize some symbols	2021-06-21 23:49:25 -07:00
Fangrui Song	f14e6e4451	[llvm-objcopy] Delete empty namespace. NFC	2021-06-21 23:44:07 -07:00
Rong Xu	8c68eb8306	[SampleFDO] Make FSDiscriminator flag part of function parameters Add a parameter of IsFSDiscriminator to function getBaseDiscriminatorFromDiscriminator(). This function currently checks the internal flag of --enable-fs-discriminator. This is not good because we might change the default value of the internal flag. Note that we have a default parameter. This is just because create_afdo_tool has a call-site to it. I will remove the default parameter in a later patch. Differential Revision: https://reviews.llvm.org/D104584	2021-06-21 14:37:45 -07:00
Langston Barrett	a240358833	[llvm-reduce] Don't delete arguments of intrinsics The argument reduction pass shouldn't remove arguments of intrinsics, because the resulting module is ill-formed, and so inherently uninteresting. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D103129	2021-06-21 12:43:58 -07:00
Fangrui Song	ea23c38d06	[llvm-profdata] Allow omission of -o for --text output This makes it more convenient to get a text format profile. Add an error for printing non-text format output to a terminal for instrumentation profile. (It cannot be portably tested. For sample profile, raw_fd_ostream is hidden deeply so it's inconvenient to add a diagnostic.) Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D104600	2021-06-21 12:01:57 -07:00
Fangrui Song	8ea2a58a2e	[llvm-profdata] Make diagnostics consistent with the (no capitalization, no period) style The format is currently inconsistent. Use the https://llvm.org/docs/CodingStandards.html#error-and-warning-messages style. And add `error:` or `warning:` to CHECK lines wherever appropriate.	2021-06-19 14:54:25 -07:00
Fangrui Song	0f558db742	[llvm-profdata] Delete unneeded empty output filename check	2021-06-19 12:20:45 -07:00
Fangrui Song	59d90fe817	Simplify some typedef struct	2021-06-19 11:36:44 -07:00
Hongtao Yu	bd52495518	[CSSPGO] Undoing the concept of dangling pseudo probe As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the dangling probe related code in both the compiler and llvm-profgen. I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Cinder. Certain benchmarks such as 602.gcc have a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104477	2021-06-18 15:14:11 -07:00
Hongtao Yu	fb19aa0c74	[CSSPGO][llvm-profgen] Fix an issue in findDisjointRanges We were using 0 as an indicator of invalid offset when computing disjoint ranges. In reality, 0 can be a valid code offset which stands for the first function in .text section. I'm using UINT64_MAX as an invalid code offset instead. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104497	2021-06-18 14:38:48 -07:00
Hongtao Yu	8c2c97287e	[CSSPGO][llvm-profgen] Ignore LBR records after interrupt transition If we have seen an inwards transition from external code to internal code, but not a following outwards transition, the inwards transition is likely due to an interrupt, which is usually unpaired. Ignore current and subsequent entries since they are likely from an unrelated pre-interrupt context. LBR records from different interrupt contexts are unrelated and they should not be mixed together. Currently the OS does this for task-scheduling interrupt but not for all interrupts. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D104276	2021-06-18 12:13:53 -07:00
Hongtao Yu	c60f1d5d98	[CSSPGO] Fix an invalid hash table reference issue in the CS preinliner. We were using a `StringMap` object to store all profiles to be emitted. The object is basically an unordered hash table, therefore updating it in the process of traversing it may cause issues since the underlying bucket array could change. I'm also moving the `csspgo-preinliner` switch around so that no context trie will be constructed (by the constructor of `CSPreInliner`) when the switch is off. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104267	2021-06-18 11:54:23 -07:00
Heejin Ahn	1d891d44f3	[WebAssembly] Rename event to tag We recently decided to change 'event' to 'tag', and 'event section' to 'tag section', out of the rationale that the section contains a generalized tag that references a type, which may be used for something other than exceptions, and the name 'event' can be confusing in the web context. See - https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130 - https://github.com/WebAssembly/exception-handling/pull/161 Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D104423	2021-06-17 20:34:19 -07:00
Min-Yih Hsu	c29555342c	[MCA] Anchoring the vtable of CustomBehaviour Put the dtor of mca::CustomBehaviour into the cpp file to avoid undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared library. Differential Revision: https://reviews.llvm.org/D104401	2021-06-16 12:43:58 -07:00
Fangrui Song	d619cf5ac5	[llvm-objcopy][MachO] Copy LC_LINKER_OPTIMIZATION_HINT This fixes `error: unsupported load command (cmd=0x2e)`	2021-06-16 12:09:50 -07:00
Hongtao Yu	cef9b96b01	[CSSPGO] Report zero-count probe in profile instead of dangling probes. Previously dangling samples were represented by INT64_MAX in sample profile while probes never executed were not reported. This was based on an observation that dangling probes were only a smaller portion than zero-count probes. However, with compiler optimizations, dangling probes end up becoming a large portion of all probes in general and reporting them does not make sense from a profile size point of view. This change flips sample reporting by reporting zero-count probes instead. This enables dangling probes to be represented by none (missing entry in profile). This has a couple of benefits: 1. Reducing sample profile size in optimize mode, even when the number of non-executed probes exceeds the number of dangling probes, since INT64_MAX takes more space than 0 to encode. 2. Binary size savings. No need to encode dangling probes anymore, since missing probes are treated as dangling in the profile reader. 3. Reducing compiler work to track dangling probes. However, for probes that are really dead and removed, we still need the compiler to identify them so that they can be reported as zero-count, instead of being mistreated as dangling probes. 4. Improving counts quality by respecting the counts already collected on the non-dangling copy of a probe. A probe, when duplicated, gets two copies at runtime. If one of them is dangling while the other is not, merging the two probes at profile generation time will cause the real samples collected on the non-dangling one to be discarded. Not reporting the dangling counterpart will keep the real samples. 5. Better readability. 6. Be consistent with non-CS dwarf line number based profile. Zero counts are trusted by the compiler counts inferencer while missing counts will be inferred by the compiler. Note that the current patch does not include any work for #3. There will be follow-up changes. 
For #1, I've seen for a large Facebook service, the text profile is reduced by 7%. For extbinary profile, the size of LBRProfileSection is reduced by 35%. For #4, I have seen general counts quality for SPEC2017 is improved by 10%. Reviewed By: wenlei, wlei, wmi Differential Revision: https://reviews.llvm.org/D104129	2021-06-16 11:45:29 -07:00
Fangrui Song	1de18ad8d7	[llvm-objcopy] Make ihex writer similar to binary writer There is no need to differentiate whether `UseSegments` is true or false. Unifying the cases makes the behavior closer to BinaryWriter. This improves compatibility with objcopy because SHF_ALLOC sections not in a PT_LOAD will not be skipped. Such cases are usually erroneous input, though. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D104186	2021-06-16 10:08:20 -07:00
Patrick Holland	ef16c8eaa5	Reapply "[MCA] Adding the CustomBehaviour class to llvm-mca". The original change was pushed in main as commit `f7a23ecece`. It was then reverted by commit `a04f01bab2` because it caused linker failures on buildbots that don't build the AMDGPU target. -- Some instructions are not defined well enough within the target’s scheduling model for llvm-mca to be able to properly simulate its behaviour. The ideal solution to this situation is to modify the scheduling model, but that’s not always a viable strategy. Maybe other parts of the backend depend on that instruction being modelled the way that it is. Or maybe the instruction is quite complex and it’s difficult to fully capture its behaviour with tablegen. The CustomBehaviour class (which I will refer to as CB frequently) is designed to provide intuitive scaffolding for developers to implement the correct modelling for these instructions. More details are available in the original commit log message (`f7a23ecece`). Differential Revision: https://reviews.llvm.org/D104149	2021-06-16 16:54:48 +01:00
James Henderson	b9ce8ea454	[obj2yaml] Address D104035 review comments Accidentally missed from commit `5c1639fe06`. Differential Revision: https://reviews.llvm.org/D104035	2021-06-16 15:01:54 +01:00
James Henderson	5c1639fe06	[yaml2obj][obj2yaml] Support custom ELF section header string table name This patch adds support for a new field in the FileHeader, which states the name to use for the section header string table. This also allows combining the string table with another string table in the object, e.g. the symbol name string table. The field is optional. By default, .shstrtab will continue to be used. This partially fixes https://bugs.llvm.org/show_bug.cgi?id=50506. Reviewed by: Higuoxing Differential Revision: https://reviews.llvm.org/D104035	2021-06-16 10:02:23 +01:00
Andrea Di Biagio	a04f01bab2	Revert "[MCA] Adding the CustomBehaviour class to llvm-mca" This reverts commit `f7a23ecece`. It appears to break buildbots that don't build the AMDGPU backend.	2021-06-15 21:41:36 +01:00
Patrick Holland	f7a23ecece	[MCA] Adding the CustomBehaviour class to llvm-mca Some instructions are not defined well enough within the target’s scheduling model for llvm-mca to be able to properly simulate its behaviour. The ideal solution to this situation is to modify the scheduling model, but that’s not always a viable strategy. Maybe other parts of the backend depend on that instruction being modelled the way that it is. Or maybe the instruction is quite complex and it’s difficult to fully capture its behaviour with tablegen. The CustomBehaviour class (which I will refer to as CB frequently) is designed to provide intuitive scaffolding for developers to implement the correct modelling for these instructions. Implementation details: llvm-mca does its best to extract relevant register, resource, and memory information from every MCInst when lowering them to an mca::Instruction. It then uses this information to detect dependencies and simulate stalls within the pipeline. For some instructions, the information that gets captured within the mca::Instruction is not enough for mca to simulate them properly. In these cases, there are two main possibilities: 1. The instruction has a dependency that isn’t detected by mca. 2. mca is incorrectly enforcing a dependency that shouldn’t exist. For the rest of this discussion, I will be focusing on (1), but I have put some thought into (2) and I may revisit it in the future. So we have an instruction that has dependencies that aren’t picked up by mca. The basic idea for both pipelines in mca is that when an instruction wants to be dispatched, we first check for register hazards and then we check for resource hazards. This is where CB is injected. If no register or resource hazards have been detected, we make a call to CustomBehaviour::checkCustomHazard() to give the target specific CB the chance to detect and enforce any custom dependencies. 
The return value for checkCustomHazard() is an unsigned int representing the (minimum) number of cycles that the instruction needs to stall for. It’s fine to underestimate this value because when StallCycles gets down to 0, we’ll end up checking for all the hazards again before the instruction is actually dispatched. However, it’s important not to overestimate the value and the more accurate your estimate is, the more efficient mca’s execution can be. In general, for checkCustomHazard() to be able to detect these custom dependencies, it needs information about the current instruction and also all of the instructions that are still executing within the pipeline. The mca pipeline uses mca::Instruction rather than MCInst and the current information encoded within each mca::Instruction isn’t sufficient for my use cases. I had to add a few extra attributes to the mca::Instruction class and have them get set by the MCInst during instruction building. For example, the current mca::Instruction doesn’t know its opcode, and it also doesn’t know anything about its immediate operands (both of which I had to add to the class). With information about the current instruction, a list of all currently executing instructions, and some target specific objects (MCSubtargetInfo and MCInstrInfo which the base CB class has references to), developers should be able to detect and enforce most custom dependencies within checkCustomHazard. If you need more information than is present in the mca::Instruction, feel free to add attributes to that class and have them set during the lowering sequence from MCInst. Fortunately, in the in-order pipeline, it’s very convenient for us to pass these arguments to checkCustomHazard. The hazard checking is taken care of within InOrderIssueStage::canExecute(). 
This function takes a const InstRef as a parameter (representing the instruction that currently wants to be dispatched) and the InOrderIssueStage class maintains a SmallVector<InstRef, 4> which holds all of the currently executing instructions. For the out-of-order pipeline, it’s a bit trickier to get the list of executing instructions and this is why I have held off on implementing it myself. This is the main topic I will bring up when I eventually make a post to discuss and ask for feedback. CB is a base class where targets implement their own derived classes. If a target specific CB does not exist (or we pass in the -disable-cb flag), the base class is used. This base class trivially returns 0 from its checkCustomHazard() implementation (meaning that the current instruction needs to stall for 0 cycles aka no hazard is detected). For this reason, targets or users who choose not to use CB shouldn’t see any negative impacts to accuracy or performance (in comparison to pre-patch llvm-mca). Differential Revision: https://reviews.llvm.org/D104149	2021-06-15 21:30:48 +01:00
Simon Pilgrim	941188e965	[llvm-exegesis] Fix X86LbrCounter destructor to correctly unmap memory and not double-close fd (PR50620) As was reported on PR50620, the X86LbrCounter destructor was double-closing the filedescriptor and not unmapping the buffer. Differential Revision: https://reviews.llvm.org/D104201	2021-06-15 14:24:35 +01:00
wlei	863184dd69	[CSSPGO] Aggregation by the last K context frames for cold profiles This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one. This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D104131	2021-06-14 10:33:43 -07:00
David Blaikie	02c718301b	llvm-objcopy: fix section size truncation/extension when dumping sections Since this only comes up with inputs containing sections at least 4GB large (I guess I could use a bzero section or something, so the input file doesn't have to be 4GB, but even then the output file would have to be 4GB, right?) I've skipped testing this. If there's a nice way to test this without needing 4GB inputs or output files. The subtlety here is demonstrated by this code: struct t { operator uint64_t(); }; static_assert(std::is_same_v<int, decltype(std::declval<bool>() ? 0 : std::declval<t>())>); static_assert(std::is_same_v<uint64_t, decltype(std::declval<bool>() ? 0 : std::declval<uint64_t>())>); Because of this difference, the original source code was getting an int type (truncating the actual size) and then extending it again, resulting in bogus values (I haven't thought through this hard enough to explain why the resulting value was 0xffff... - sign extension, possible UB, but in any case it's the wrong answer - in this particular case I was looking at that resulted in a size so large that we couldn't open a file large enough to write to and ended up with a rather vague: error: 'file_name.o': Invalid argument	2021-06-12 19:00:10 -07:00
Ian McIntyre	5899278758	[llvm-objcopy] Exclude empty sections in IHexWriter output IHexWriter was evaluating a section's physical address when deciding if that section should be written to an output. This approach does not account for a zero-sized section that has the same physical address as a sized section. The behavior varies from GNU objcopy, and may result in a HEX file that does not include all program sections. The IHexWriter now excludes zero-sized sections when deciding what should be written to the output. This affects the contents of the writer's `Sections` collection; we will not try to insert multiple sections that could have the same physical address. The behavior seems consistent with GNU objcopy, which always excludes empty sections, no matter the address. The new test case evaluates the IHexWriter behavior when provided a variety of empty sections that overlap or append a filled section. See the input file's comments for more information. Given that test input, and the change to the IHexWriter, GNU objcopy and llvm-objcopy produce the same output. Reviewed By: jhenderson, MaskRay, evgeny777 Differential Revision: https://reviews.llvm.org/D101332	2021-06-12 12:23:07 -07:00
Alexander Shaposhnikov	0276cc742b	[llvm-objcopy][MachO] Do not strip symbols with the flag REFERENCED_DYNAMICALLY set Do not strip symbols having the flag REFERENCED_DYNAMICALLY set. Test plan: make check-all Differential revision: https://reviews.llvm.org/D104092	2021-06-11 16:34:59 -07:00
Andrew Litteken	8bc0eb4011	Revert "[IRSim] Adding basic implementation of llvm-sim." This reverts commit `f47d00c54b`.	2021-06-11 15:44:19 -05:00
Andrew Litteken	f47d00c54b	[IRSim] Adding basic implementation of llvm-sim. This is a similarity visualization tool that accepts a Module and passes it to the IRSimilarityIdentifier. The resulting SimilarityGroups are output in a JSON file. Tests are found in test/tools/llvm-sim and check for the file not found, a bad module, and that the JSON is created correctly. Reviewers: paquette, jroelofs, MaskRay Recommit of: `15645d044b` to fix linking errors. Differential Revision: https://reviews.llvm.org/D86974	2021-06-11 14:56:41 -05:00
Simon Pilgrim	61cdaf66fe	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Simon Pilgrim	646e970d44	[llvm-stress] Fix dead code preventing us generating per-element vector selects This has been reported several times by the PVS Studio team as well as coming up in some static analysis. getRandom() % 1 always returns 0 so we never actually test this codepath, (git blame suggests this has always been like this) - given that we have plenty of other "getRandom() & 1" the typo is pretty obvious, and matches the intention in the comment above - with this change we generate a nice mixture of scalar/vector condition selects of vectors. I don't know llvm-stress that well - but I don't think we guarantee that the same seed value will always generate the same IR for later versions of the program - just that the same binary would. Differential Revision: https://reviews.llvm.org/D104022	2021-06-11 10:56:19 +01:00
Simon Pilgrim	d789ed11ea	Fix implicit dependency on <string> header. NFCI.	2021-06-11 10:24:14 +01:00
David Tenty	75d4f55d15	[AIX] Build libLTO as MODULE rather than SHARED On CMake versions >= 3.16 on AIX, shared libraries are created as archives (which is the normal form for the platform). However, plugin libraries which are passed directly to an executable, like libLTO to the linker, are usually built as plain `.so`, so this patch restores this behaviour for libLTO on AIX (and adjusts the name if need be to account for the fact that llvm_add_library likes to force an empty name prefix on modules), so we end up with the expected libLTO.so. Reviewed By: w2yehia Differential Revision: https://reviews.llvm.org/D103824	2021-06-10 12:08:59 -04:00
Sam Powell	5b5ab80e31	Reland "[llvm] llvm-tapi-diff" This is relanding commit `d1d36f7ad2`. This patch additionally addresses failures found in buildbots due to unstable build ordering & post review comments. This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-09 21:17:34 -07:00
Eric Astor	4b5317e937	[ms] [llvm-ml] Add support for INCLUDE environment variable Also adds support for the ML.exe command-line flag /X, which ignores the INCLUDE environment variable. This relands commit `c43f413b01` using lit's cross-platform `env` support. Differential Revision: https://reviews.llvm.org/D103989	2021-06-09 17:54:40 -04:00
Cyndy Ishida	e7b755ecb1	Revert "Reland "[llvm] llvm-tapi-diff"" This reverts commit `20126c9fd4`. The sorting fixes failed to have stable output on different platforms.	2021-06-09 13:48:09 -07:00
Cyndy Ishida	1899cb7d0e	Revert "[llvm-tapi-diff] Apply stable sorting to output" This reverts commit `90a26a41e9`. This failed to fix ubuntu failures.	2021-06-09 13:48:09 -07:00
Sam Powell	90a26a41e9	[llvm-tapi-diff] Apply stable sorting to output * For the output, the attributes within the target slice should be grouped by the input order, then sorted by value ordering. This is to fix current ubuntu buildbot inconsistencies.	2021-06-09 13:09:47 -07:00
Eric Astor	68d0db0b6d	Revert "[ms] [llvm-ml] Add support for INCLUDE environment variable" This reverts commit `c43f413b01` due to Windows environment build breaks	2021-06-09 15:49:51 -04:00
Eric Astor	c43f413b01	[ms] [llvm-ml] Add support for INCLUDE environment variable Also adds support for the ML.exe command-line flag /X, which ignores the INCLUDE environment variable.	2021-06-09 15:25:26 -04:00
Sam Powell	20126c9fd4	Reland "[llvm] llvm-tapi-diff" This is relanding commit `d1d36f7ad2`. This patch additionally addresses failures found in buildbots & post review comments. This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-09 10:35:41 -07:00
Florian Hahn	e978f6bc97	[LTO] Support new PM in ThinLTOCodeGenerator. This patch adds initial support for using the new pass manager when doing ThinLTO via libLTO. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D102627	2021-06-09 10:05:14 +01:00
Brendon Cahoon	294efbbd3e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit `211e584fa2`. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Brendon Cahoon	211e584fa2	Revert "[AMDGPU] Add gfx1013 target" This reverts commit `ea10a86984`. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00
Brendon Cahoon	ea10a86984	[AMDGPU] Add gfx1013 target Differential Revision: https://reviews.llvm.org/D103663	2021-06-08 12:49:49 -04:00
David Blaikie	c5d56fec50	NFC: .clang-tidy: Inherit configs from parents to improve maintainability In the interests of disabling misc-no-recursion across LLVM (this seems like a stylistic choice that is not consistent with LLVM's style/development approach) this NFC preliminary change adjusts all the .clang-tidy files to inherit from their parents as much as possible. This change specifically preserves all the quirks of the current configs in order to make it easier to review as NFC. I validated the change is NFC as follows: for X in `cat ../files.txt`; do mkdir -p ../tmp/$(dirname $X) touch $(dirname $X)/blaikie.cpp clang-tidy -dump-config $(dirname $X)/blaikie.cpp > ../tmp/$(dirname $X)/after rm $(dirname $X)/blaikie.cpp done (similarly for the "before" state, without this patch applied) for X in `cat ../files.txt`; do echo $X diff \ ../tmp/$(dirname $X)/before \ <(cat ../tmp/$(dirname $X)/after \ \| sed -e "s/,readability-identifier-naming$.$,-readability-identifier-naming/\1/" \ \| sed -e "s/,-llvm-include-order$.$,llvm-include-order/\1/" \ \| sed -e "s/,-misc-no-recursion$.$,misc-no-recursion/\1/" \ \| sed -e "s/,-clang-diagnostic-\$.$,clang-diagnostic-\/\1/") done (using sed to strip some add/remove pairs to reduce the diff and make it easier to read) The resulting report is: .clang-tidy clang/.clang-tidy 2c2 < Checks: 'clang-diagnostic-,clang-analyzer-,-,clang-diagnostic-,llvm-,misc-,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-readability-identifier-naming,-misc-no-recursion' --- > Checks: 'clang-diagnostic-,clang-analyzer-,-,clang-diagnostic-,llvm-,misc-,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-misc-no-recursion' compiler-rt/.clang-tidy 2c2 < Checks: 'clang-diagnostic-,clang-analyzer-,-,clang-diagnostic-,llvm-,-llvm-header-guard,misc-,-misc-unused-parameters,-misc-non-private-member-variables-in-classes' --- > Checks: 
'clang-diagnostic-,clang-analyzer-,-,clang-diagnostic-,llvm-,misc-,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-llvm-header-guard' flang/.clang-tidy 2c2 < Checks: 'clang-diagnostic-,clang-analyzer-,-,llvm-,-llvm-include-order,misc-,-misc-no-recursion,-misc-unused-parameters,-misc-non-private-member-variables-in-classes' --- > Checks: 'clang-diagnostic-,clang-analyzer-,-,llvm-,misc-,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-llvm-include-order,-misc-no-recursion' flang/include/flang/Lower/.clang-tidy flang/include/flang/Optimizer/.clang-tidy flang/lib/Lower/.clang-tidy flang/lib/Optimizer/.clang-tidy lld/.clang-tidy lldb/.clang-tidy llvm/tools/split-file/.clang-tidy mlir/.clang-tidy The `clang/.clang-tidy` change is a no-op, disabling an option that was never enabled. The compiler-rt and flang changes are no-op reorderings of the same flags. (side note, the .clang-tidy file in parallel-libs is broken and crashes clang-tidy because it uses "lowerCase" as the style instead of "lower_case" - so I'll deal with that separately) Differential Revision: https://reviews.llvm.org/D103842	2021-06-08 08:25:59 -07:00
jasonliu	8e84311a84	[XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing Add in the ability of parsing symbol table for 64 bit object. Reviewed By: jhenderson, DiggerLin Differential Revision: https://reviews.llvm.org/D85774	2021-06-07 17:24:13 +00:00
Simon Pilgrim	551a697c5c	xray-color-helper.cpp - add missing implicit cmath header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h (necessary for gcc builds but not MSVC)	2021-06-05 21:33:24 +01:00
Simon Pilgrim	6ff62d7e17	xray-color-helper.h - sort includes. NFCI.	2021-06-05 21:33:23 +01:00
Rong Xu	8d581857d7	[SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550	2021-06-04 11:22:06 -07:00
Cyndy Ishida	5337c7550d	Revert "[llvm] llvm-tapi-diff" This reverts commit `d1d36f7ad2`. Reverting this patch to investigate linux bot failures + fix with author offline	2021-06-03 21:10:51 -07:00
Wenlei He	aaa826fac1	[CSSPGO][llvm-profgen] Make extended binary the default output format Make extended binary the default output format for CSSPGO. This avoids having to pass flag every time when generating profile. It also matches llvm-profdata where binary profile is the default (should we switch to extbinary as default for llvm-profdata?). We plan to compress name table for context profile, which depends on the built-in compression of extbinary. Differential Revision: https://reviews.llvm.org/D103650	2021-06-03 17:58:16 -07:00
Sam Powell	d1d36f7ad2	[llvm] llvm-tapi-diff This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-03 11:38:00 -07:00
Nikita Popov	983565a6fe	[ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) This is a followup to D103422. The DenseMapInfo implementations for ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h headers, which means that these two headers no longer need to be included by DenseMapInfo.h. This required adding a few additional includes, as many files were relying on various things pulled in by ArrayRef.h. Differential Revision: https://reviews.llvm.org/D103491	2021-06-03 18:34:36 +02:00
Kim-Anh Tran	de51c48ed3	[llvm-dwp] Add support for rnglists and loclists This patch updates llvm-dwp to include rnglists and loclists when parsing debug sections. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101894	2021-06-02 12:31:35 -07:00
Kim-Anh Tran	316da543af	[llvm-dwp] Add support for DWARFv5 type units ... This patch adds support for DWARFv5 type units: parsing from the .debug_info section, and writing index to the type unit index. Previously, the type units were part of the .debug_types section which is no longer used in DWARFv5. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101818	2021-06-02 12:24:08 -07:00
Kim-Anh Tran	6e2d3049d2	[llvm-dwp] Adding support for v5 index writing This patch adds general support for DWARFv5 index writing. In particular, this means only allowing inputs with one version, either DWARFv5 or DWARFv4. This patch adds the .debug_macro section as an example, but the DWARFv5 type support and loc and rangelists are still missing (and upcoming). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102315	2021-06-02 12:21:31 -07:00
Kim-Anh Tran	595b1683b7	[llvm-dwp] Skip type unit debug info sections This patch makes llvm-dwp skip debug info sections that may not be encoding a compile unit. In DWARF5, debug info sections are also used for type units. In preparation for type unit support, make llvm-dwp aware of other uses of debug info sections but skip them for now. The patch first records all .debug_info sections, then goes through them one by one and records the cu debug info section for writing the index unit, and copies that section to the final dwp output info section. If it's not a compile unit, skip. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102312	2021-06-02 11:48:10 -07:00
Rahman Lavaee	616ac1b961	[llvm-readobj] Print function names with `--bb-addr-map`. This patch uses the `getSymbolIndexForFunctionAddress` helper function to print function names for BB address map entries. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102900	2021-06-01 18:40:42 -07:00
gbreynoo	e60f147324	[llvm-dwarfdump][test] Add missing dedicated tests for some options This change adds tests specifically for --parent-recurse-depth, --quiet and -o. The test for -o found a typo in an error message which is also fixed in this change. Differential Revision: https://reviews.llvm.org/D103250	2021-06-01 14:57:00 +01:00
Andrea Di Biagio	9853d0db1e	[MCA][NFCI] Minor changes to InstrBuilder and Instruction. This is based on the assumption that most simulated instructions don't define more than one or two registers. This is true for example on x86, where most instruction definitions don't declare more than one register write. The default code region size has been increased from 8 to 16. This is based on the assumption that, for small microbenchmarks, the typical code snippet size is often less than 16 instructions. mca::Instruction now uses bitfields to pack flags. No functional change intended.	2021-05-31 17:05:13 +01:00
Alexey Lapshin	83cc4478a0	[llvm-objcopy][NFC] Refactor CopyConfig structure - remove lazy options processing. During review of D102277 it was decided to remove lazy options processing from the llvm-objcopy CopyConfig structure. This patch transforms processing of ELF lazy options into in-place processing. Differential Revision: https://reviews.llvm.org/D103260	2021-05-31 14:40:27 +03:00
Andrea Di Biagio	50770d8de5	[MCA] Refactor the InOrderIssueStage stage. NFCI Moved the logic that checks for RAW hazards from the InOrderIssueStage to the RegisterFile. Changed how the InOrderIssueStage keeps track of backend stalls. Stall events are now generated from method notifyStallEvent(). No functional change intended.	2021-05-27 22:28:04 +01:00
Simon Giesecke	5f2d4b23b4	Add --quiet option to llvm-gsymutil to suppress output of warnings. Differential Revision: https://reviews.llvm.org/D102829	2021-05-27 12:36:34 +00:00
Esme-Yi	d82f2a123f	[llvm-objdump] Print the DEBUG type under `--section-headers`. Summary: Under the option --section-headers, we can only print the section types of TEXT, DATA, and BSS for now. This patch adds the DEBUG type. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D102603	2021-05-27 04:53:14 +00:00
Rahman Lavaee	6505c63040	[llvm-readobj] Optimize printing stack sizes to linear time. Currently, each function name lookup is a linear iteration over all symbols defined in the object file which makes the total running time quadratic. This patch optimizes the function name lookup by populating an address to index map upon the first function name lookup which is used to lookup each function name in O(1). impact: For the clang binary built with `-fstack-size-section`, this improves the running time of `llvm-readobj --stack-size` from 7 minutes to 0.25 seconds. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D103072	2021-05-26 13:14:33 -07:00
Fangrui Song	73a1179535	[llvm-mc] Add -M to replace -riscv-no-aliases and -riscv-arch-reg-names In objdump, many targets support `-M no-aliases`. Instead of having a `-*-no-aliases` for each target when LLVM adds the support, it makes more sense to introduce objdump style `-M`. -riscv-arch-reg-names is removed. -riscv-no-aliases has too many uses and thus is retained for now. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D103004	2021-05-26 10:43:32 -07:00
Esme-Yi	bf809cd165	[NFC][object] Change the input parameter of the method isDebugSection. Summary: This is an NFC patch to change the input parameter of the method SectionRef::isDebugSection(), by replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine if a section is debug type in more ways than just by section name. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102601	2021-05-26 08:47:53 +00:00
Wenlei He	fa14fd30ce	[CSSPGO][llvm-profgen] Change default cold threshold for context merging llvm-profgen uses profile summary based cold threshold to merge and trim cold context profile. This is to strike a good balance between profile size and performance. We've been using 99.9% as the cutoff to save profile size without affecting performance. This change switches to using 99.9% instead of 99.9999% as the default cold threshold cutoff for llvm-profgen. Redundant switch csprof-cold-thres is also removed and tests cleaned up. Differential Revision: https://reviews.llvm.org/D103071	2021-05-25 10:41:10 -07:00
Langston Barrett	472c009139	[llvm-reduce] Exit when input module is malformed The parseInputFile function returns an empty unique_ptr to signal an error, like when the input file doesn't exist, or is malformed. In this case, the tool should exit immediately rather than segfault by dereferencing the unique_ptr later. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102891	2021-05-25 10:01:12 -07:00
Roman Lebedev	78eaff2ef8	[llvm-exegesis] Loop unrolling for loop snippet repetitor mode I really needed this, like, factually, yesterday, when verifying dependency breaking idioms for AMD Zen 3 scheduler model. Consider the following example: ``` $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=duplicate Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-4a7e50.o --- mode: inverse_throughput key: instructions: - 'VPXORYrr YMM0 YMM0 YMM0' config: '' register_initial_values: [] cpu_name: znver3 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 1000000 measurements: - { key: inverse_throughput, value: 0.31025, per_snippet_value: 0.31025 } error: '' info: '' assembled_snippet: C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C3 ... ``` What does it tell us? So wait, it can only execute ~3 x86 AVX YMM PXOR zero-idioms per cycle? That doesn't seem right. That's even less than there are pipes supporting this type of op. Now, second example: ``` $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2418b5.o --- mode: inverse_throughput key: instructions: - 'VPXORYrr YMM0 YMM0 YMM0' config: '' register_initial_values: [] cpu_name: znver3 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 1000000 measurements: - { key: inverse_throughput, value: 1.00011, per_snippet_value: 1.00011 } error: '' info: '' assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3 ... ``` Now that's just worse. Due to the looping, the throughput completely plummeted, and now we can only do a single instruction/cycle!? That's not great. 
And final example: ``` $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop --loop-body-size=1000 Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c402e2.o --- mode: inverse_throughput key: instructions: - 'VPXORYrr YMM0 YMM0 YMM0' config: '' register_initial_values: [] cpu_name: znver3 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 1000000 measurements: - { key: inverse_throughput, value: 0.167087, per_snippet_value: 0.167087 } error: '' info: '' assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3 ... ``` So if we merge the previous two approaches, do duplicate this single-instruction snippet 1000x (loop-body-size/instruction count in snippet), and run a loop with 1000 iterations over that duplicated/unrolled snippet, the measured throughput goes through the roof, up to 5.9 instructions/cycle, which finally tells us that this idiom is zero-cycle! Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D102522	2021-05-25 12:08:27 +03:00
Jonas Devlieghere	1ec03f3de5	[dsymutil] Emit an error when the Mach-O exceeds the 4GB limit. The Mach-O object file format is limited to 4GB because of its use of 32-bit offsets in the header. It is possible for dsymutil to (silently) emit an invalid binary. Instead of having consumers deal with this, emit an error.	2021-05-24 16:29:06 -07:00
Jonas Devlieghere	7bf7b80b19	[dsymutil] Use EXIT_SUCCESS and EXIT_FAILURE (NFC)	2021-05-24 16:29:05 -07:00
Jonas Devlieghere	aab488ac2a	[dsymutil] Compute the output location once per input file (NFC) Compute the location of the output file just once outside the loop over the different architectures.	2021-05-24 16:29:05 -07:00
Hongtao Yu	00bfde723b	[NFC][CSSPGO][llvm-profgen] Fix build warning due to an attribute usage.	2021-05-24 12:59:02 -07:00
Hongtao Yu	3b51b51877	[CSSPGO][llvm-profgen] Report samples for untrackable frames. Fixing an issue where samples collected for an untrackable frame are not reported. An untrackable frame refers to a frame whose caller is untrackable due to missing debug info or pseudo probe. Though the frame is connected to its parent frame through the frame pointer chain at runtime, the compiler cannot build the connection without debug info or pseudo probe. In such a case we just need to report the untrackable frame as the base frame and all of its child frames. With more samples reported I'm seeing this improves the performance of an internal benchmark by 2.5%. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D102961	2021-05-24 12:39:12 -07:00
Philipp Krones	c2f819af73	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo This makes it possible for targets to define their own MCObjectFileInfo. This MCObjectFileInfo is then used to determine things like section alignment. This is a follow up to D101462 and prepares for the RISCV backend defining the text section alignment depending on the enabled extensions. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101921	2021-05-23 14:15:23 -07:00
Sergey Dmitriev	1fb5278882	[llvm-strip] Add support for '--' for delimiting options from input files This allows llvm-strip to be used with file names that begin with dashes. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102825	2021-05-20 03:33:51 -07:00
Alexey Lapshin	081c62501e	[llvm-objcopy] Refactor CopyConfig structure. This patch prepares llvm-objcopy to move its implementation into a separate library. To make it possible it is necessary to minimize internal dependencies. Differential Revision: https://reviews.llvm.org/D99055	2021-05-20 13:14:51 +03:00
Simon Giesecke	0ddc75fd08	Add option to llvm-gsymutil to read addresses from stdin. Differential Revision: https://reviews.llvm.org/D102224	2021-05-20 06:10:35 +00:00
Patrick Holland	e5d59db469	[MCA] llvm-mca MCTargetStreamer segfault fix In order to create the code regions for llvm-mca to analyze, llvm-mca creates an AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions(). Within this function, both an MCAsmParser and MCTargetAsmParser are created so that MCAsmParser::Run() can be used to create the code regions for us. These parser classes were created for llvm-mc so they are designed to emit code with an MCStreamer and MCTargetStreamer that are expected to be setup and passed into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any code, an MCStreamerWrapper class gets created instead and passed into the MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides many of the emit methods to just do nothing. The exception is the emitInstruction() method which calls Regions.addInstruction(Inst). This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build our code regions, however there are a few directives which rely on the MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because llvm-mca doesn’t setup an MCTargetStreamer, when the parser encounters one of those directives, a segfault will occur. In x86, each one of these 7 directives will cause this segfault if they exist in the input assembly to llvm-mca: .cv_fpo_proc .cv_fpo_setframe .cv_fpo_pushreg .cv_fpo_stackalloc .cv_fpo_stackalign .cv_fpo_endprologue .cv_fpo_endproc I haven’t looked at other targets, but I wouldn’t be surprised if some of the other ones also have certain directives which could result in this same segfault. My proposed solution is to simply initialize an MCTargetStreamer after we initialize the MCStreamerWrapper. 
The MCTargetStreamer requires an ostream object, but we don’t actually want any of these directives to be emitted anywhere, so I use an ostream created with the nulls() function. Since this needs to happen after the MCStreamerWrapper has been initialized, it needs to happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize within the main() function of llvm-mca. So this MCInstPrinter gets constructed within main() then passed into the parseCodeRegions() function as a parameter. (If you feel like it would be appropriate and possible to create the MCInstPrinter within the parseCodeRegions() function, then feel free to modify my solution. That would stop us from having to pass it into the function and would limit its scope / lifetime.) My solution stops the segfault from happening and still passes all of the current (expected) llvm-mca tests. I also added a new test for x86 that checks for this segfault on an input that includes one of the .cv_fpo directives (this test fails without my solution, but passes with it). As far as I can tell, all of the functions that I modified are only called from within llvm-mca so there shouldn’t be any worries about breaking other tools. Differential Revision: https://reviews.llvm.org/D102709	2021-05-19 18:36:10 +01:00
Mariusz Ceier	9383e9c1e6	Fix lld macho standalone build by including llvm/Config/llvm-config.h instead of llvm/Config/config.h lld/MachO/Driver.cpp and lld/MachO/SyntheticSections.cpp include llvm/Config/config.h which doesn't exist when building standalone lld. This patch replaces llvm/Config/config.h include with llvm/Config/llvm-config.h just like it is in lld/ELF/Driver.cpp and HAVE_LIBXAR with LLVM_HAVE_LIBXAR and moves LLVM_HAVE_LIBXAR from config.h to llvm-config.h Also it adds LLVM_HAVE_LIBXAR to LLVMConfig.cmake and links liblldMachO2.so with XAR_LIB if LLVM_HAVE_LIBXAR is set. Differential Revision: https://reviews.llvm.org/D102084	2021-05-19 11:15:07 -04:00
Sergey Dmitriev	f24f140290	[llvm-objcopy] Add support for '--' for delimiting options from input/output files This allows llvm-objcopy to be used with file names that begin with dashes. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102665	2021-05-19 01:56:46 -07:00
Arthur Eubanks	0c509dbc7e	[NewPM] Add options to PrintPassInstrumentation To bring D99599's implementation in line with the existing PrintPassInstrumentation, and to fix a FIXME, add more customizability to PrintPassInstrumentation. Introduce three new options. The first takes over the existing "-debug-pass-manager-verbose" cl::opt. The second and third option are specific to -fdebug-pass-structure. They allow indentation, and also don't print analysis queries. To avoid more golden file tests than necessary, prune down the -fdebug-pass-structure tests. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102196	2021-05-18 20:59:35 -07:00
Lang Hames	49cdd62db5	[llvm-jitlink] Link libnetwork on Haiku in llvm-jitlink The system's network API is in libnetwork.so, so we explicitly need to link to it on Haiku. This patch is similar to https://reviews.llvm.org/D97633. Patch by Niels Reedijk. Thanks Niels! Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D98405	2021-05-14 20:49:03 -07:00
Fangrui Song	4f05f4c8e6	[CMake][ELF] Link libLLVM.so and libclang-cpp.so with -Bsymbolic-functions llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html In an ELF shared object, a default visibility defined symbol is preemptible by default. This creates some missed optimization opportunities. -Bsymbolic-functions is more aggressive than our current -fvisibility-inlines-hidden (present since 2012) as it applies to all function definitions. It can * avoid PLT for cross-TU function calls && reduce dynamic symbol lookup * reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64 In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313 The built clang with `-DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` is significantly faster. See the Linux kernel build result https://bugs.archlinux.org/task/70697 Note: the performance of -fno-semantic-interposition -Bsymbolic-functions libLLVM.so and libclang-cpp.so is close to a PIE binary linking against `libLLVM.a` and `libclang.a`. When the host compiler is Clang, -Bsymbolic-functions is the major contributor. On x86-64 (with GOTPCRELX) and ppc64 ELFv2, the GOT/TOC relocations can be optimized. Some implication: Interposing a subset of functions is no longer supported. (This is fragile on ELF and unsupported on Mach-O at all. For Mach-O we don't use `ld -interpose` or `-flat_namespace`) Compiling a program which takes the address of any LLVM function with `{gcc,clang} -fno-pic` and expects the address to equal to the address taken from libLLVM.so or libclang-cpp.so is unsupported. 
I am fairly confident that llvm-project shouldn't have different behaviors depending on such pointer equality (as we've been using -fvisibility-inlines-hidden which applies to inline functions for a long time), but if we accidentally do, users should be aware that they should not make assumptions about pointer equality in `-fno-pic` mode. See more on https://maskray.me/blog/2021-05-09-fno-semantic-interposition Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D102090	2021-05-13 13:44:57 -07:00
Martin Storsjö	b42fb6811e	[llvm-nm] Support the -V option, print that the tool is compatible with GNU nm This unlocks some codepaths in libtool. Differential Revision: https://reviews.llvm.org/D102321	2021-05-13 22:36:25 +03:00
Aakanksha Patil	464e4dc50f	[AMDGPU] Add gfx1034 target Differential Revision: https://reviews.llvm.org/D102306	2021-05-13 14:25:18 -04:00
Oliver Stannard	92260d7a18	Revert "[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions" This reverts commit `3bf1acab5b`. This is causing the test `gcov-shared-flush.c' to fail on the 2-stage aarch64 buildbots (https://lab.llvm.org/buildbot/#/builders/7/builds/2720).	2021-05-13 14:31:17 +01:00
Fangrui Song	3bf1acab5b	[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html In an ELF shared object, a default visibility defined symbol is preemptible by default. This creates some missed optimization opportunities. -fno-semantic-interposition can optimize -fPIC: * in Clang: avoid GOT/PLT cost for variable access/function calls to external linkage definition in the same TU * in GCC: enable interprocedural optimizations (including inlining) and avoid PLT See https://gist.github.com/MaskRay/2d4dfcfc897341163f734afb59f689c6 for more information. -Bsymbolic-functions is more aggressive than -fvisibility-inlines-hidden (present since 2012) as it applies to all function definitions. It can * avoid PLT for cross-TU function calls && reduce dynamic symbol lookup * reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64 With both options, the libLLVM.so and libclang-cpp.so performance should be closer to PIE binary linking against `libLLVM.a` and `libclang.a` (In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313 The built clang with `-DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` is significantly faster. See the Linux kernel build result https://bugs.archlinux.org/task/70697 ) Some implication: Interposing a subset of functions is no longer supported. (This is fragile anyway and cannot really be supported. For Mach-O we don't use `ld -interpose`, so interposition is not supported on Mach-O at all.) Compiling a program which takes the address of any LLVM function with `{gcc,clang} -fno-pic` and expects the address to equal to the address taken from libLLVM.so or libclang-cpp.so is unsupported. 
I am fairly confident that llvm-project shouldn't have different behaviors depending on such pointer equality (as we've been using -fvisibility-inlines-hidden which applies to inline functions for a long time), but if we accidentally do, users should be aware that they should not make assumptions about pointer equality in `-fno-pic` mode. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D102090	2021-05-12 10:34:31 -07:00
Greg McGary	5a43901539	[llvm-objdump] Exclude __mh_*_header symbols during MachO disassembly `__mh_(execute|dylib|dylinker|bundle|preload|object)_header` are special symbols whose values hold the VMA of the Mach header to support introspection. They are attached to the first section in `__TEXT`, even though their addresses are outside `__TEXT`, and they do not refer to code. It is normally harmless, but when the first section of `__TEXT` has no other symbols, `__mh_*_header` is considered by the disassembler when determining function boundaries. Since `__mh_*_header` refers to an address outside `__TEXT`, the boundary determination fails and disassembly quits. Since `__TEXT,__text` normally has symbols, this bug is obscured. Experiments placing `__stubs` and `__stub_helper` first exposed the bug, since neither has symbols. Differential Revision: https://reviews.llvm.org/D101786	2021-05-12 06:39:14 -07:00
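
Note on 6505c63040 ([llvm-readobj] Optimize printing stack sizes to linear time): the message describes replacing a per-query scan over all symbols with an address-to-index map that is populated on the first function name lookup. The sketch below illustrates that general technique only. `Symbol` and `FunctionNameLookup` are hypothetical names invented for this note and are not the types or helpers used in llvm-readobj.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object-file symbol (not an LLVM type).
struct Symbol {
  std::string Name;
  std::uint64_t Address;
};

// Hypothetical helper: resolves a function name from its address.
// The address-to-index map is built once, on the first query, so each
// later query is an average O(1) hash lookup instead of a linear scan.
class FunctionNameLookup {
public:
  explicit FunctionNameLookup(std::vector<Symbol> Syms)
      : Symbols(std::move(Syms)) {}

  // Returns a pointer to the symbol name, or nullptr if no symbol is
  // defined at Addr.
  const std::string *nameForAddress(std::uint64_t Addr) {
    if (!IndexBuilt)
      buildIndex();
    auto It = AddressToIndex.find(Addr);
    if (It == AddressToIndex.end())
      return nullptr;
    return &Symbols[It->second].Name;
  }

  bool indexBuilt() const { return IndexBuilt; }

private:
  // Single pass over all symbols; emplace keeps the first symbol seen
  // for an address if several symbols share it.
  void buildIndex() {
    for (std::size_t I = 0; I < Symbols.size(); ++I)
      AddressToIndex.emplace(Symbols[I].Address, I);
    IndexBuilt = true;
  }

  std::vector<Symbol> Symbols;
  std::unordered_map<std::uint64_t, std::size_t> AddressToIndex;
  bool IndexBuilt = false;
};
```

With N symbols and M lookups, this costs one O(N) pass plus M constant-time queries, rather than O(N * M) for repeated scans, which is the kind of quadratic-to-linear change the commit message reports.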

12876 Commits
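
Note on 472c009139 ([llvm-reduce] Exit when input module is malformed): the message says the parse function returns an empty unique_ptr on error and that the tool should exit immediately rather than dereference it later. A minimal sketch of that pattern follows. `Module`, `parseInput`, and `runTool` are hypothetical names for illustration, and the "empty input is malformed" rule is an assumption made only to keep the example small; none of this is the llvm-reduce code.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <string>

// Hypothetical module type (not llvm::Module).
struct Module {
  std::string Text;
};

// Hypothetical parser: returns an empty unique_ptr to signal an error.
// Assumption for this sketch: empty input counts as malformed.
std::unique_ptr<Module> parseInput(const std::string &Source) {
  if (Source.empty())
    return nullptr;
  return std::unique_ptr<Module>(new Module{Source});
}

// Hypothetical tool entry point: checks the parse result before any
// dereference and reports failure through the return code.
int runTool(const std::string &Source) {
  std::unique_ptr<Module> M = parseInput(Source);
  if (!M) {
    std::cerr << "error: input module is missing or malformed\n";
    return 1; // bail out early instead of crashing later
  }
  std::cout << "parsed " << M->Text.size() << " bytes\n";
  return 0;
}
```

The check sits directly after the call that can fail, so every later use of `M` can assume a valid pointer.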