llvm-project

Commit Graph

Author	SHA1	Message	Date
Jan Svoboda	622354a522	[llvm][ADT] Implement `BitVector::{pop_,}back` LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement. Currently, some users of `std::vector<bool>` cannot switch to `llvm::BitVector` because it doesn't implement the `pop_back()` and `back()` functions. To enable easy transition of `std::vector<bool>` users, this patch implements `llvm::BitVector::pop_back()` and `llvm::BitVector::back()`. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D117115	2022-01-21 14:50:53 +01:00
Juergen Ributzka	3025c3eded	Replace PlatformKind with PlatformType. The PlatformKind/PlatformType enums contain the same information, which requires them to be kept in-sync. This commit changes over to PlatformType as the sole source of truth, which allows the removal of the redundant PlatformKind. The majority of the changes were in LLD and TextAPI. Reviewed By: cishida Differential Revision: https://reviews.llvm.org/D117163	2022-01-13 09:23:49 -08:00
Leonard Grey	0f85393004	[MachO] Port call graph profile section and directive This ports the `.cg_profile` assembly directive and call graph profile section generation to MachO from COFF/ELF. Due to MachO section naming rules, the section is called `__LLVM,__cg_profile` rather than `.llvm.call-graph-profile` as in COFF/ELF. Support for llvm-readobj is included to facilitate testing. Corresponding LLD change is D112164 Differential Revision: https://reviews.llvm.org/D112160	2022-01-12 09:22:26 -05:00
Kazu Hirata	b932bdf59f	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-07 17:45:09 -08:00
Kazu Hirata	e5947760c2	Revert "[llvm] Remove redundant member initialization (NFC)" This reverts commit `fd4808887e`. This patch causes gcc to issue a lot of warnings like: warning: base class ‘class llvm::MCParsedAsmOperand’ should be explicitly initialized in the copy constructor [-Wextra]	2022-01-03 11:28:47 -08:00
Kazu Hirata	fd4808887e	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-01 16:18:18 -08:00
Fangrui Song	c6bf71363a	[ELFAsmParser] Optimize hasPrefix with StringRef::consume_front	2021-12-30 00:16:03 -08:00
Sami Tolvanen	9a74c753fe	[ThinLTO][MC] Use conditional assignments for promotion aliases Inline assembly references to static functions with ThinLTO+CFI were fixed in D104058 by creating aliases for promoted functions. Creating the aliases unconditionally resulted in an unexpected size increase in a Chrome helper binary: https://bugs.chromium.org/p/chromium/issues/detail?id=1261715 This is caused by the compiler being unable to drop unused code now referenced by the alias in module-level inline assembly. This change adds a .set_conditional assembly extension, which emits an assignment only if the target symbol is also emitted, avoiding phantom references to functions that could have otherwise been dropped. This is an alternative to the solution proposed in D112761. Reviewed By: pcc, nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D113613	2021-12-10 12:21:37 -08:00
Phoebe Wang	d7c07f60b3	[X86][MS-InlineAsm] Make the constraint *m to be simple place holder D113096 solved the "undefined reference to xxx" issue by adding constraint *m for the global var. But it has a strong side effect due to the symbol in the assembly being replaced with a constraint variable. This leads to some lowering failures. https://godbolt.org/z/h3nWoerPe This patch fixes the problem by using the constraint *m as a placeholder rather than a real constraint. It has negligible effect on the existing code generation. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D115225	2021-12-10 09:29:38 +08:00
Tobias Burnus	c01c62c76c	[MC][ELF] Fix accepting abbreviated form with Type change Follow up to D92052 and D94072, exposed due to D107707 Many assemblers permit that only the first .section contains all the attributes like '.lds_bss,"w",@nobits' and later sections only use the name ('.lds_bss'), inheriting those attributes from the first section. It turned out that the case where Type changed was missed when implementing it, and D107707 makes it much more likely to hit that issue. That's fixed by this commit. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D114717	2021-11-30 14:45:26 +00:00
Kazu Hirata	d14d7068b6	[llvm] Use StringRef::contains (NFC)	2021-10-23 08:45:27 -07:00
Anirudh Prasad	fb99424a6f	[SystemZ][z/OS] Introduce initial support for GOFF asm parser - Introduce a skeleton outline for the GOFFAsmParser - Before instantiating AsmParser/HLASMAsmParser, target specific asm parsers are attempted to be initialized first before proceeding. If it doesn't exist for a particular file type, we report a fatal error. - This patch allows to properly instantiate the HLASMAsmParser on z/OS, and ensures we can write lit tests and unit tests which will involve the instantiation of asm parsers, without an assert / fatal error. Reviewed By: uweigand, Kai Differential Revision: https://reviews.llvm.org/D110730	2021-10-01 10:29:14 -04:00
Peter Smith	b026ce9c8a	[MC] Add Subtarget for MAsmParser call to emitCodeAlignment The call to emitCodeAlignment was missing a STI which is required after D45962. emitCodeAlignment has a default parameter of 0 for MaxBytesToEmit. Explicitly passing 0 here was interpreted as a nullptr for the STI. This could possibly be avoided by taking STI as a const reference in emitCodeAlignment. Differential Revision: https://reviews.llvm.org/D109425	2021-09-08 13:28:24 +01:00
Peter Smith	5e71839f77	[MC] Add MCSubtargetInfo to MCAlignFragment In preparation for passing the MCSubtargetInfo (STI) through to writeNops so that it can use the STI in operation at the time, we need to record the STI in operation when a MCAlignFragment may write nops as padding. The STI is currently unused, a further patch will pass it through to writeNops. There are many places that can create an MCAlignFragment, in most cases we can find out the STI in operation at the time. In a few places this isn't possible as we are in initialisation or finalisation, or are emitting constant pools. When possible I've tried to find the most appropriate existing fragment to obtain the STI from, when none is available use the per module STI. For constant pools we don't actually need to use EmitCodeAlign as the constant pools are data anyway so falling through into it via an executable NOP is no better than falling through into data padding. This is a prerequisite for D45962 which uses the STI to emit the appropriate NOP for the STI. Which can differ per fragment. Note that involves an interface change to InitSections. It is now called initSections and requires a SubtargetInfo as a parameter. Differential Revision: https://reviews.llvm.org/D45961	2021-09-07 15:46:19 +01:00
Tozer	5c6f748cbc	[MCParser] Correctly handle CRLF line ends when consuming line comments Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=47983 The AsmLexer currently has an issue with lexing line comments in files with CRLF line endings, in which it reads the carriage return as being part of the line comment. This causes an error for certain valid comment layouts; this patch fixes this by excluding the carriage return from the line comment. Differential Revision: https://reviews.llvm.org/D90234	2021-08-17 15:52:51 +01:00
Simon Atanasyan	990e8025b5	[MC][ELF] Do not error on parsing .debug_* section directive for MIPS MIPS .debug_* sections should have SHT_MIPS_DWARF section type to distinguish among sections containing DWARF and ECOFF debug formats, but in assembly files these sections have SHT_PROGBITS (@progbits) type. Now the assembler shows a 'changed section type for ...' error when parsing a `.section .debug_*,"",@progbits` directive for MIPS targets. The same problem exists for the x86-64 target, and this patch extends the workaround implemented in D76151. The patch adds one more case where the assembler ignores a section type mismatch after the `SwitchSection()` call. Differential Revision: https://reviews.llvm.org/D107707	2021-08-09 08:54:56 +03:00
Arthur Eubanks	ad25344620	[MC][CodeGen] Emit constant pools earlier Previously we would emit constant pool entries for ldr inline asm at the very end of AsmPrinter::doFinalization(). However, if we're emitting dwarf aranges, that would end all sections with aranges. Then if we have constant pool entries to be emitted in those same sections, we'd hit an assert that the section has already been ended. We want to emit constant pool entries before emitting dwarf aranges. This patch splits out arm32/64's constant pool entry emission into its own MCTargetStreamer virtual method. Fixes PR51208 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D107314	2021-08-03 20:55:31 -07:00
Anirudh Prasad	a8cfa4b9bd	[SystemZ][z/OS] Initial code to generate assembly files on z/OS - This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target. - Only the .text and the .bss sections are added for now. - The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections. - This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target - Further improvements and additions will be made in future patches. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D106380	2021-07-27 11:29:15 -04:00
Eric Astor	a4e964a282	[ms] [llvm-ml] Fix macro case-insensitivity We previously had issues identifying macros not registered with a lowercase name. Reviewed By: mstorsjo, thakis Differential Revision: https://reviews.llvm.org/D106453	2021-07-22 15:50:52 -04:00
Eric Astor	5fba605896	[ms] [llvm-ml] Support built-in text macros Add support for all built-in text macros supported by ML64: @Date, @Time, @FileName, @FileCur, and @CurSeg. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104965	2021-07-21 11:44:09 -04:00
Eric Astor	4cbb912d75	[ms] [llvm-ml] Add support for numeric built-in symbols Support @Version and @Line as built-in symbols. For now, resolves @Version to 1427 (the same as for the VS 2019 release of ML.EXE). Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104964	2021-07-21 11:43:07 -04:00
Simon Tatham	e49985bb60	Remove unused parameter from parseMSInlineAsm. No implementation uses the `LocCookie` parameter at all. Errors are reported from inside that function by `llvm::SourceMgr`, and the instance of that at the clang call site arranges to pass the error messages back to a `ClangAsmParserCallback`, which is where the clang SourceLocation for the error is computed. (This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. But this particular change seems beneficial in its own right.) Reviewed By: miyuki Differential Revision: https://reviews.llvm.org/D105490	2021-07-12 15:07:03 +01:00
Eric Astor	678211de6d	[ms] [llvm-ml] Standardize blocking of lexical substitution In MASM, the ifdef family of directives treats its argument literally, without expanding it as a text macro. Add support for this, and also replace the special handling that was previously used for echo. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104196	2021-07-02 14:17:37 -04:00
Jinsong Ji	bf64210fd8	[AIX] Add dummy XCOFF MCAsmParserExtension Implement XCOFFMCAsmParser so that we can use MC to parse inline asm. The directives and storage mapping classes will be added later iteratively. Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D105259	2021-07-02 16:12:21 +00:00
Jonas Paulsson	7aef99351a	[MCStreamer] Move emission of attributes section into MCELFStreamer Enable the emission of a GNU attributes section by reusing the code for emitting the ARM build attributes section. The GNU attributes follow the exact same section format as the ARM BuildAttributes section, so this can be factored out and reused for GNU attributes generally. The immediate motivation for this is to emit a GNU attributes section for the vector ABI on SystemZ (https://reviews.llvm.org/D105067). Review: Logan Chien, Ulrich Weigand Differential Revision: https://reviews.llvm.org/D102894	2021-06-30 16:00:27 -05:00
Anirudh Prasad	2dca0b5a1c	[AsmParser][SystemZ][z/OS] Fix hanging scenario in HLASMAsmParser class - In the caller of the overridden `parseStatement` function (i.e. the `AsmParser::Run()`) in the case of an error and if we're not at the start of the statement, we "eat" up until the end of the current statement, so we don't have to process it again. - However, in the HLASMAsmParser class what's happening is that, if an error occurs at the very start of the statement (for example, you invoke the HLASMAsmParser to parse a gnu directive), we will error out, but we never really progress in terms of the next token in the statement to parse. We simply keep looping processing the same error over and over again (partly because we're at the start of the statement) - To remedy this, when the `parseAsHLASMLabel` function fails, before returning, we "eat" until the end of the statement function, so we don't process it anymore. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D104869	2021-06-28 12:47:08 -04:00
Eric Astor	c8d0d8a8a1	[ms] [llvm-ml] Add support for ALIGN, EVEN, and ORG directives Match ML.EXE's behavior for ALIGN, EVEN, and ORG directives both at file level and in STRUCTs. We currently reject negative offsets passed to ORG inside STRUCTs (in ML.EXE and ML64.EXE, they wrap around as for an unsigned 32-bit integer). Also, if a STRUCT is declared using an ORG directive, no value of that type can be defined. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92507	2021-06-25 17:19:45 -04:00
Martin Storsjö	42f74e8249	[llvm] Rename StringRef _lower() method calls to _insensitive() This is a mechanical change. This actually also renames the similarly named methods in the SmallString class, however these methods don't seem to be used outside of the llvm subproject, so this doesn't break building of the rest of the monorepo.	2021-06-25 00:22:01 +03:00
Anirudh Prasad	631362665c	[AsmParser][SystemZ][z/OS] Support for emitting labels in upper case - Currently, the emitting of labels in the parsePrimaryExpr function is case dependent. It just takes the identifier and emits it. - However, for HLASM the emitting of labels is case independent. We are emitting them in the upper case only, to enforce case independency. So we need to ensure that at the time of parsing the label we are emitting the upper case (in `parseAsHLASMLabel`), but also, when we are processing a PC-relative relocatable expression, we need to ensure we emit it in upper case (in `parsePrimaryExpr`) - To achieve this a new MCAsmInfo attribute has been introduced which corresponding targets can override if needed. Reviewed By: abhina.sreeskantharajan, uweigand Differential Revision: https://reviews.llvm.org/D104715	2021-06-24 12:50:11 -04:00
RamNalamothu	167e7afcd5	Implement DW_CFA_LLVM_* for Heterogeneous Debugging Add support in MC/MIR for writing/parsing, and DebugInfo. This is part of the Extensions for Heterogeneous Debugging defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html Specifically the CFI instructions implemented here are defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#cfa-definition-instructions Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D76877	2021-06-14 08:51:50 +05:30
Eric Astor	d81c059c3e	[ms] [llvm-ml] Fix capitalization of the ignored CPU directives These directives are matched in lowercase, so make sure to use lowercase for their P suffix. Differential Revision: https://reviews.llvm.org/D104206	2021-06-13 18:34:42 -04:00
Eric Astor	f03a3caac5	[ms] [llvm-ml] Warn on command-line redefinition If a macro is defined on the command line and then overridden in the source code, this is likely to be an error in the user's build system. We should warn on this. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104008	2021-06-10 14:20:21 -04:00
Eric Astor	00ebbedd1c	[ms] [llvm-ml] Make variable redefinition match ML.EXE MASM specifies that all variable definitions are redefinable, except for EQU definitions to expressions. (TEXTEQU is unspecified, but appears to be fully redefinable as well.) Also, in practice, ML.EXE allows redefinitions where the value doesn't change. Make variable redefinition possible for text macros, suppressing expansion if written as the first argument to an EQU or TEXTEQU directive. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D103993	2021-06-10 08:36:15 -04:00
Eric Astor	c8d6e67d53	[ms] [llvm-ml] Fix parity errors in error handling for INCLUDE directive Also adds basic testing for "include" directive. Differential Revision: https://reviews.llvm.org/D103980	2021-06-09 13:34:36 -04:00
Eric Astor	dc0c3fe5f3	[ms] [llvm-ml] Disambiguate size directives and variable declarations MASM allows statements of the form: <VAR> DWORD 5 to declare a variable with name <VAR>, while: call dword ptr [<value>] is a valid instruction. To disambiguate, we recognize size directives by the trailing "ptr" token. As discussed in https://lists.llvm.org/pipermail/llvm-dev/2021-May/150774.html Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D103257	2021-06-08 15:44:31 -04:00
Anirudh Prasad	e52007cac4	[SystemZ][z/OS] Stricter condition for HLASM class instantiation - A lot of lit tests simply specify the arch minus the triple. On z/OS, this could result in a scenario of some-other-triple-unknown-ibm-zos. This points to an incorrect triple + arch combo. - To prevent this, isOSzOS change is switched in favour of isOSBinFormatGOFF. - This is because, the GOFF format is set only if the triple is systemz and if the operating system is GOFF. And currently, there are no other architectures/os's using the GOFF file format. - An argument could be made that the problematic tests be fixed to explicitly specify the arch-vendor-triple string, but there's a large number of these tests, and adding this stricter scope ensures that we aren't instantiating the incorrect instance of the AsmParser for other platforms when run on z/OS. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D103343	2021-06-01 15:56:50 -04:00
Anirudh Prasad	b8dcd920ec	[AsmParser][SystemZ][z/OS] Introducing HLASM Parser support to AsmParser - Part 2 - This patch is the second (and hopefully final) part of providing HLASM syntax for inline asm statements for z/OS to LLVM (continuing on from https://reviews.llvm.org/D98276) - This second part deals with providing label support - As mentioned in https://reviews.llvm.org/D98276, if the first token is not a space we process the first token as a label, and the remaining tokens as a possible machine instruction - To achieve this, a new `parseAsHLASMLabel` function is introduced. This function processes the first token, validates whether it is an "acceptable" label according to HLASM standards, and then emits it - After handling and emitting the label, call the `parseAsMachineInstruction` instruction to process the remaining tokens as a machine instruction. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D103320	2021-05-31 11:27:02 -04:00
Anirudh Prasad	f076da66b9	[AsmParser][SystemZ][z/OS] Introducing HLASM Parser support to AsmParser - Part 1 - This patch is one in a series of patches which introduces HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html | main RFC here ]]) - This patch in particular introduces HLASM Parser support for Z machine instructions. - The approach taken here was to subclass `AsmParser`, and make various functions and variables "protected" wherever appropriate. - The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well. The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf | HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format): ``` <TokA><spaces.*><TokB><spaces.*><TokC><spaces.*><TokD> ``` 1. TokA is referred to as the Name Entry. This token is optional 2. TokB is referred to as the Operation Entry. This token is mandatory. 3. TokC is referred to as the Operand Entry. This token is mandatory 4. TokD is referred to as the Remarks Entry. This token is optional - If TokA is provided, then we either parse TokA as a possible comment or as a label (Name Entry), TokB as the Operation Entry and so on. - If TokA is not provided (i.e. we have one or more spaces and then the first token), then we will parse the first token (i.e TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction and TokD as a possible Remark field - TokC (Operand Entry), no spaces are allowed between OperandEntries. If a space occurs it is classified as an error. - TokD if provided is taken as is, and emitted as a comment. The following additional approach was examined, but not taken: - Adding custom private only functions to base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of no use to every other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would also have the same disadvantage. Testing: - This patch doesn't have tests added with it, for the sole reason that MCStreamer Support and Object File support hasn't been added for the z/OS target (yet). Hence, it's not possible to generate code outright for the z/OS target. They are in the process of being committed / process of being worked on. - Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated. Reviewed By: Kai, uweigand Differential Revision: https://reviews.llvm.org/D98276	2021-05-19 11:05:30 -04:00
Sam Clegg	3041b16f73	[WebAssembly] Add TLS data segment flag: WASM_SEG_FLAG_TLS Previously the linker was relying solely on the name of the segment to imply TLS. Differential Revision: https://reviews.llvm.org/D102202	2021-05-12 13:31:02 -07:00
Sam Clegg	3b8d2be527	Reland: "[lld][WebAssembly] Initial support merging string data" This change was originally landed in: `5000a1b4b9` It was reverted in: `061e071d8c` This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world. Unlike the ELF linker this support is currently limited: - No support for SHF_MERGE (non-string merging) - Always do full tail merging ("lo" can be merged with "hello") - Only support single byte strings (p2align 0) Like the ELF linker merging is only performed at `-O1` and above. This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it does not currently support debug sections because they are not represented by data segments (they are custom sections) Differential Revision: https://reviews.llvm.org/D97657	2021-05-10 16:03:38 -07:00
Nico Weber	061e071d8c	Revert "[lld][WebAssembly] Initial support merging string data" This reverts commit `5000a1b4b9`. Breaks tests, see https://reviews.llvm.org/D97657#2749151 Easily repros locally with `ninja check-llvm-mc-webassembly`.	2021-05-10 18:28:28 -04:00
Sam Clegg	5000a1b4b9	[lld][WebAssembly] Initial support merging string data This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world. Unlike the ELF linker this support is currently limited: - No support for SHF_MERGE (non-string merging) - Always do full tail merging ("lo" can be merged with "hello") - Only support single byte strings (p2align 0) Like the ELF linker merging is only performed at `-O1` and above. This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it does not currently support debug sections because they are not represented by data segments (they are custom sections) Differential Revision: https://reviews.llvm.org/D97657	2021-05-10 13:15:12 -07:00
Philipp Krones	632ebc4ab4	[MC] Untangle MCContext and MCObjectFileInfo This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: You can't construct a MOFI without a MCContext without constructing the MCContext with a dummy version of that MOFI first. This removes this dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext making it more available to the different targets. Namely: - TargetTriple - ObjectFileType - SubtargetInfo Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101462	2021-05-05 10:03:02 -07:00
Anirudh Prasad	ae2aef1361	[AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM - As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information) - This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set - This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings` Reviewed By: abhina.sreeskantharajan, Kai Differential Revision: https://reviews.llvm.org/D101660	2021-05-05 10:21:55 -04:00
Fangrui Song	7cac6a9d7a	[MC] Add MCAsmParser::parseComma to improve diagnostics llvm-mc will error "expected comma" instead of "unexpected token".	2021-05-04 14:13:19 -07:00
Fangrui Song	7b1e1fccb0	[MC] Don't capitalize a floating point diagnostic	2021-05-04 13:40:26 -07:00
Fangrui Song	3d473ae72e	[MC] Remove unneeded "in '.xxx' directive" from diagnostics The directive name is not useful because the next line replicates the error line which includes the directive.	2021-05-04 13:30:29 -07:00
Anirudh Prasad	ca02fab7e7	[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("*") for Z PC-relative instructions. - This patch attempts to implement the location counter syntax (*) for the HLASM variant for PC-relative instructions. - In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for "* " which is parsed as "<pc-rel-insn 0>" - For combinations of absolute values and relocatable values, we don't expect the "*" preceding the token. When you have a " * " what’s accepted is: ``` *<space>.*{.*} -> <pc-rel-insn> 0 *[+|-][constant-value] -> <pc-rel-insn> [+|-]constant-value ``` When you don’t have a " * " what’s accepted is: ``` brasl 1,func is allowed (MCSymbolRef type) brasl 1,func+4 is allowed (MCBinary type) brasl 1,4+func is allowed (MCBinary type) brasl 1,-4+func is allowed (MCBinary type) brasl 1,func-4 is allowed (MCBinary type) brasl 1,*func is not allowed (* cannot be used for non-MCConstantExprs) brasl 1,*+func is not allowed (* cannot be used for non-MCConstantExprs) brasl 1,*+func+4 is not allowed (* cannot be used for non-MCConstantExprs) brasl 1,*+4+func is not allowed (* cannot be used for non-MCConstantExprs) brasl 1,*-4+8+func is not allowed (* cannot be used for non-MCConstantExprs) ``` Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D100987	2021-05-03 14:58:24 -04:00
Anirudh Prasad	ded0a70aeb	[AsmParser][SystemZ][z/OS] Reject "Dot" as current PC on z/OS - Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter) - However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch) - The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not. - It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`) Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100975	2021-04-29 11:58:54 -04:00
Anirudh Prasad	07b0a72d8e	[AsmParser][SystemZ][z/OS] Use updated framework in AsmLexer to accept special tokens as Identifiers - Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers. - These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ. - In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_|$|@|#] are already supported as "part" of an Identifier. - The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100959	2021-04-28 15:43:24 -04:00
Anirudh Prasad	8f6185c713	[AsmParser][ms][X86] Fix possible misbehaviour in parsing of special tokens at start of string. - Previously, https://reviews.llvm.org/D72680 introduced a new attribute called `AllowSymbolAtNameStart` (in relation to the MAsmParser changes) in `MCAsmInfo.h` which (according to the comment in the header) allows the following behaviour: ``` /// This is true if the assembler allows $ @ ? characters at the start of /// symbol names. Defaults to false. ``` - However, the usage of this field in AsmLexer.cpp doesn't seem completely accurate* for a couple of reasons. ``` default: if (MAI.doesAllowSymbolAtNameStart()) { // Handle Microsoft-style identifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]* if (!isDigit(CurChar) && isIdentifierChar(CurChar, MAI.doesAllowAtInName(), AllowHashInIdentifier)) return LexIdentifier(); } ``` 1. The Dollar and At tokens, when occurring at the start of the string, are treated as separate tokens (AsmToken::Dollar and AsmToken::At respectively) and not lexed as an Identifier. 2. I'm not too sure why `MAI.doesAllowAtInName()` is used when `AllowAtInIdentifier` could be used. For X86 platforms, afaict, this shouldn't be an issue, since the `CommentString` attribute isn't "@". (alternatively the call to the setter can be set anywhere else as needed). The `AllowAtInName` does have an additional important meaning, but in the context of AsmLexer, shouldn't mean anything different compared to `AllowAtInIdentifier` My proposal is the following: - Introduce 3 new fields called `AllowQuestionTokenAtStartOfString`, `AllowDollarTokenAtStartOfString` and `AllowAtTokenAtStartOfString` in MCAsmInfo.h which will encapsulate the previously documented behaviour of "allowing $, @, ? characters at the start of symbol names") - Introduce these fields where "$", "@" are lexed, and treat them as identifiers depending on whether `Allow[Dollar|At]TokenAtStartOfString` is set. - For the sole case of "?", append it to the existing logic for treating a "default" token as an Identifier. z/OS (HLASM) will also make use of some of these fields in follow up patches. completely accurate* - This was based on the comments and the intended behaviour of the code. I might have completely misinterpreted it, and if that is the case my sincere apologies. We can close this patch if necessary, if there are no changes to be made :) Depends on https://reviews.llvm.org/D99374 Reviewed By: Jonathan.Crowther Differential Revision: https://reviews.llvm.org/D99889	2021-04-21 10:21:09 -04:00
Anirudh Prasad	6ddd8c28b7	[AsmParser][SystemZ][z/OS] Add support to AsmLexer to accept HLASM style integers - Add support for HLASM style integers. These are the decimal integers [0-9]. - HLASM does not support the additional prefixed integers like, `0b`, `0x`, octal integers and Masm style integers. - To achieve this, a field `LexHLASMStyleIntegers` (similar to the `LexMasmStyleIntegers` field) is introduced in `MCAsmLexer.h` as well as a corresponding setter. Note: This field could also go into MCAsmInfo.h. I used the previous precedent set by the `LexMasmIntegers` field. Depends on https://reviews.llvm.org/D99286 Reviewed By: epastor Differential Revision: https://reviews.llvm.org/D99374	2021-04-13 15:29:37 -04:00
Anirudh Prasad	f7eec83932	[AsmParser][SystemZ][z/OS] Add in support to allow use of additional comment strings. - Currently, MCAsmInfo provides a CommentString attribute, that various targets can set, so that the AsmLexer can appropriately lex a string as a comment based on the set value of the attribute. - However, AsmLexer also supports a few additional comment syntaxes, in addition to what's specified as a CommentString attribute. This includes regular C-style block comments (/* ... */), regular C-style line comments (// .... ) and #. While I'm not sure as to why this behaviour exists, I am assuming it does to maintain backward compatibility with GNU AS (see https://sourceware.org/binutils/docs/as/Comments.html#Comments for reference) For example: Consider a target which sets the CommentString attribute to '*'. The following strings are all lexed as comments. ``` "# abc" -> comment "// abc" -> comment "/* abc */" -> comment "* abc" -> comment ``` - In HLASM however, only "*" is accepted as a comment string, and nothing else. - To achieve this, an additional attribute (`AllowAdditionalComments`) has been added to MCAsmInfo. If this attribute is set to false, then only the string specified by the CommentString attribute is used as a possible comment string to be lexed by the AsmLexer. The regular C-style block comments, line comments and "#" are disabled. As a final note, "#" will still be treated as a comment, if the CommentString attribute is set to "#". Depends on https://reviews.llvm.org/D99277 Reviewed By: abhina.sreeskantharajan, myiwanch Differential Revision: https://reviews.llvm.org/D99286	2021-04-13 11:15:09 -04:00
LemonBoy	edb18ea5a9	[AsmParser] Recognize more escaped characters between single quotes The GNU AS manual states the following about single-character constants enclosed within single quotes: > Some backslash escapes apply to characters, \b, \f, \n, \r, \t, and \" with the same meaning as for strings, plus \' for a single quote. Add two more characters to the switch handling this case to match GAS behaviour, plus a test to make sure nothing regresses. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D99609	2021-04-08 09:59:37 +02:00
Abhina Sreeskantharajan	82b3e28e83	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text Problem: On SystemZ we need to open text files in text mode. On Windows, opening a file in text mode adds a CRLF '\r\n' which may not be desirable. Solution: This patch adds two new flags - OF_CRLF which indicates that CRLF translation is used. - OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation. Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF. So this is the behaviour per platform with my patch: z/OS: OF_None: open in binary mode OF_Text : open in text mode OF_TextWithCRLF: open in text mode Windows: OF_None: open file with no carriage return OF_Text: open file with no carriage return OF_TextWithCRLF: open file with carriage return The major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set. ``` if (Flags & OF_CRLF) CrtOpenFlags |= _O_TEXT; ``` The following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows. ./llvm/lib/Support/raw_ostream.cpp ./llvm/lib/TableGen/Main.cpp ./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp ./llvm/unittests/Support/Path.cpp ./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp ./clang/lib/Frontend/CompilerInstance.cpp ./clang/lib/Driver/Driver.cpp ./clang/lib/Driver/ToolChains/Clang.cpp Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D99426	2021-04-06 07:23:31 -04:00
Ricky Taylor	4db18d62af	[M68k] Add support for Motorola literal syntax to AsmParser These look like $00A0cf for hex and %001010101 for binary. They are used in Motorola assembly syntax. Differential Revision: https://reviews.llvm.org/D98519	2021-04-05 20:02:29 +01:00
Eric Astor	0499a9d688	[ms] [llvm-ml] Accept /WX to signal that warnings should be fatal. Define -fatal-warnings to make warnings fatal, and accept /WX as an ML.EXE compatible alias for it. Also make sure that if Warning() returns true, we always treat it as an error. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92504	2021-04-02 15:13:20 -04:00
Eric Astor	15ec0ad77a	[ms] [llvm-ml] Fix case-sensitivity for variables and textmacros Make variables and text-macro references case-insensitive, to match ml.exe. Also improve error handling for text-macro expansion. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92503	2021-04-02 14:08:02 -04:00
Anirudh Prasad	7b921a6747	[AsmParser][SystemZ][z/OS] Add in support to accept "#" as part of an Identifier token - This patch adds in support to accept the "#" character as part of an Identifier. - This support is needed especially for the HLASM dialect since "#" is treated as part of the valid "Alphabet" range - The way this is done is by making use of the previous precedent set by the `AllowAtInIdentifier` field in `MCAsmLexer.h`. A new field called `AllowHashInIdentifier` is introduced. - The static function `IsIdentifierChar` is also updated to accept the `#` character if the `AllowHashInIdentifier` field is set to true. Note: The field introduced in `MCAsmLexer.h` could very well be moved to `MCAsmInfo.h`. I'm not opposed to it. I decided to put it in `MCAsmLexer` since there seems to be some sort of precedent already with `AllowAtInIdentifier`. Reviewed By: abhina.sreeskantharajan, nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D99277	2021-04-01 11:24:43 -04:00
Konstantin Zhuravlyov	f4ace63737	AMDGPU: Add target id and code object v4 support - Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object) - Add kernarg_size to kernel descriptor - Change trap handler ABI to no longer move queue pointer into s[0:1] - Cleanup ELF definitions - Add V2, V3, V4 suffixes to make a clear distinction for code object version - Consolidate note names Differential Revision: https://reviews.llvm.org/D95638	2021-03-24 11:54:05 -04:00
Anirudh Prasad	301d9261b7	[AsmParser][SystemZ][z/OS] Re-introduce HLASM comment syntax - https://reviews.llvm.org/rGb605cfb336989705f391d255b7628062d3dfe9c3 was reverted due to sanitizer bugs in the introduced unit-test (specifically in the Address sanitizer https://lab.llvm.org/buildbot/#/builders/5/builds/5697) - This patch attempts to rectify that, as well as re-factor parts of the test - The issue was that, within the `setupCallToAsmParser` function in the unit-test, `SrcMgr` was declared as a local variable. `SrcMgr` owns a unique pointer. Since the variable goes out of scope at the end of the function, the unique pointer is released. - This patch moves the declaration of the `SrcMgr` variable to a class field, since the scope will remain until the class's destructor is invoked (which in this case is at the end of the unit test) - Furthermore, this patch also moves the `MCContext Ctx` declaration from a local variable instance inside a function, to a unique pointer class field. This ensures the instantiation of the MCContext remains until the tear down of the test. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D99004	2021-03-24 10:17:00 -04:00
Yuanfang Chen	b4a8c0ebb6	[LTO][MC] Discard non-prevailing defined symbols in module-level assembly This is the alternative approach to D96931. In LTO, for each module with inlineasm block, prepend directive ".lto_discard <sym>, <sym>*" to the beginning of the inline asm. ".lto_discard" is both a module inlineasm block marker and (optionally) provides a list of symbols to be discarded. In MC while emitting for inlineasm, discard symbol binding & symbol definitions according to ".lto_discard". Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98762	2021-03-18 15:33:42 -07:00
Anirudh Prasad	9f5da80013	Revert "[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax"" This reverts commit `b605cfb336`. Differential Revision: https://reviews.llvm.org/D98744	2021-03-16 18:39:04 -04:00
Anirudh Prasad	b605cfb336	[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax" - Previously, https://reviews.llvm.org/D97703 was reverted (https://reviews.llvm.org/D98543) as it broke the unit test build when shared libs were on. - This patch reverts the "revert" and makes two minor changes - The first is it also links in the MCParser lib when building the unittest. This should resolve the issue when building with shared libs on and off - The second renames the unit test from `SystemZAsmLexer` to `SystemZAsmLexerTests` since the convention for unittest binaries is to suffix the name of the unit test with "Tests" Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D98666	2021-03-16 17:11:46 -04:00
Hubert Tong	4f9cc1512d	Revert "[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax" This reverts commit `bcdd40f802`. See https://reviews.llvm.org/D98543.	2021-03-12 14:48:00 -05:00
Anirudh Prasad	bcdd40f802	[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax - This patch adds in support for the ordinary HLASM comment syntax asm statements (Reference - Chapter 7, Comment Statements, Ordinary Comment Statements) - In brief, the ordinary comment syntax if used, must begin with the "*" character - To achieve this, this patch makes use of the CommentString attribute provided in the base MCAsmInfo class - In the SystemZMCAsmInfo class, the CommentString attribute was set to "*" based on the assembler dialect - Furthermore, a new attribute RestrictCommentString, is provided to only treat a string as a comment if it appears at the start of the asm statement. Example: "jo *-4" is valid in HLASM (jump back 4 bytes from current point - similar to jo -4 in gnu asm) and we don't want "*-4" to be treated as a comment. - RFC for HLASM Parser support implementation: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html Reviewed By: scott.linder, Kai Differential Revision: https://reviews.llvm.org/D97703	2021-03-12 11:56:11 -05:00
Fangrui Song	45f949ee46	[MC] Migrate some parseToken(AsmToken::EndOfStatement, ...) to parseEOL()	2021-03-06 19:25:22 -08:00
Fangrui Song	bb6732cf62	[MC] Add parseEOL() overload and migrate some parseToken(AsmToken::EndOfStatement) to parseEOL() For many directives, the following diagnostics * `error: unexpected token` * `error: unexpected token in '.abort' directive` are replaced with `error: expected newline`. `unexpected token` may make the user think a different token is needed. `expected newline` is clearer about the expected token. For `in '...' directive`, the directive name is not useful because the next line replicates the error line which includes the directive.	2021-03-06 17:45:23 -08:00
Fangrui Song	e5eb3e3836	[MC] Parse end-of-line for .addrsig & .addrsig_sym	2021-03-06 16:26:27 -08:00
Fangrui Song	fd785f98aa	[MC] Parse end-of-line for .cfi_* directives Otherwise MCAsmStreamer will emit duplicate newlines.	2021-03-06 16:20:55 -08:00
Fangrui Song	d96af2ed2d	[MC] Support .symver *, *, remove As a resolution to https://sourceware.org/bugzilla/show_bug.cgi?id=25295 , GNU as from binutils 2.35 supports the optional third argument for the .symver directive. 'remove' for a non-default version is useful: `.symver def_v1, def@v1, remove` => def_v1 is not retained in the symbol table. Previously the user had to strip the original symbol or specify a `local:` version node in a version script to localize the symbol. `.symver def, def@@v1, remove` and `.symver def, def@@@v1, remove` are supported as well, though they are identical to `.symver def, def@@@v1`. local/hidden are not useful so this patch does not implement them.	2021-03-06 15:23:02 -08:00
Benjamin Kramer	955365524a	[MCParser] Bring back srcmanager diagnostics in AsmParser AsmParser may have no LLVMContext attached to it, which means after `5de2d189e6` everything goes to stderr. Restore the old behavior.	2021-03-02 13:43:03 +01:00
Yuanfang Chen	5de2d189e6	[Diagnose] Unify MCContext and LLVMContext diagnosing The situation with inline asm/MC error reporting is kind of messy at the moment. The errors from MC layout are not reliably propagated and users have to specify an inlineasm handler separately to get inlineasm diagnostics. The latter issue is not a correctness issue but could be improved. * Kill LLVMContext inlineasm diagnose handler and migrate it to use DiagnoseInfo/DiagnoseHandler. * Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This covers use cases like inlineasm, MC, and any clients using SourceMgr. * Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all use cases, only one of them is used. * If LLVMContext is available, let MCContext use LLVMContext's diagnose handler; if LLVMContext is not available, MCContext uses its own default diagnose handler which just prints SMDiagnostic. * Change a few clients (Clang, llc, lldb) to use the new way of reporting. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D97449	2021-03-01 15:58:37 -08:00
Fangrui Song	880c9c56c1	[MC] Allow .cfi_sections with empty section list GNU as supports this. This mode silently ignores .cfi_startproc/.cfi_endproc and .cfi_* in between. Also drop a diagnostic `in '.cfi_sections' directive`: the diagnostic already includes the line and it is clear the line is a `.cfi_sections` directive.	2021-02-25 22:29:49 -08:00
Jon Roelofs	7f6e331645	Support `#pragma clang section` directives on MachO targets rdar://59560986 Differential Revision: https://reviews.llvm.org/D97233	2021-02-25 09:30:10 -08:00
Petr Hosek	16af973933	[MC][ELF] Support for zero flag section groups This change introduces support for zero flag ELF section groups to LLVM. LLVM already supports COMDAT sections, which in ELF are a special type of ELF section groups. These are generally useful to enable linker GC where you want a group of sections to always travel together, that is to be either retained or discarded as a whole, but without the COMDAT semantics. Other ELF assemblers already support zero flag ELF section groups and this change helps us reach feature parity. Differential Revision: https://reviews.llvm.org/D95851	2021-02-16 14:23:40 -08:00
Sam Clegg	7e7cfce0b6	[WebAssembly] Use data sections by default This allows data sections that don't start with `.data` to be used/created. Without this, clang's `__attribute__((section("foo")))` would generate assembly that would not parse. Differential Revision: https://reviews.llvm.org/D96233	2021-02-09 11:03:06 -08:00
Fangrui Song	3e8ab54ba0	[MC] Upgrade DWARF version to 5 upon .file 0 Without `-dwarf-version`, llvm-mc uses the default `MCContext::DwarfVersion` 4. Without `-gdwarf-N`, Clang cc1as uses `clang::driver::ToolChain::GetDefaultDwarfVersion` which is 4 on many toolchains. Note: `clang -c` can synthesize .debug_info without -g. There is currently a MCParser warning upon `.file 0` and MCParser errors upon `.loc 0` if the DWARF version is less than 5. This causes friction to the following usage: ``` clang -S -g -gdwarf-5 a.c // MC warning due to .file 0, MC error due to .loc 0 clang -c a.s llvm-mc -filetype=obj a.s ``` My idea is that we can just upgrade `MCContext::DwarfVersion` to 5 upon `.file 0` to make the above commands work. The downside is that for an explicit version `clang -c -gdwarf-4 a.s`, it can be argued that the new behavior drops the probably intended diagnostic. I think the downside is small because in most cases DWARF version for an assembly action should either match the original compile action or be omitted. Ongoing discussion taking a similar action for GNU as: https://sourceware.org/pipermail/binutils/2021-January/114980.html Differential Revision: https://reviews.llvm.org/D94882	2021-02-02 09:41:05 -08:00
Fangrui Song	1477ed8465	[MC] Support SHF_GNU_RETAIN as section flag 'R' On Linux target triples, GNU as sets EI_OSABI to ELFOSABI_GNU when SHF_GNU_RETAIN is used. On `--freebsd`, it usually sets EI_OSABI to ELFOSABI_FREEBSD. GNU ld respects SHF_GNU_RETAIN only for ELFOSABI_FREEBSD/ELFOSABI_GNU. https://sourceware.org/bugzilla/show_bug.cgi?id=27282 MC doesn't set ELFOSABI_GNU for SHF_GNU_RETAIN/STB_GNU_UNIQUE/STT_GNU_IFUNC. MC assembled object files do not have special semantics in GNU ld. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D95730	2021-02-02 09:34:09 -08:00
Tobias Burnus	70ea15b889	[MC][ELF] Fix accepting abbreviated form with sh_flags and sh_entsize Followup to D92052 as I missed an issue as shown via GCC bug https://gcc.gnu.org/PR97827, namely: (e.g.) ".rodata." implies ELF::SHF_ALLOC. Crossref: - D73999 / commit `75af9da755` added for LLVM 11 a check that sh_flags and sh_entsize (and sh_type) changes are an error, in line with GNU assembler. - D92052 / commit `1deff4009e` permitted the abbreviated form which many assemblers accept and GCC generates: while the first .section contains the flags and entsize, subsequent sections simply contain the name without repeating entsize or flags. However, the latter patch's check missed that some flags are automatically set, e.g. ".rodata." implies ELF::SHF_ALLOC. Related https://bugs.llvm.org/show_bug.cgi?id=48201 Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D94072	2021-01-28 14:54:43 +00:00
Kazu Hirata	cfa241680f	[llvm] Don't include StringSwitch.h where unnecessary (NFC)	2021-01-21 19:59:48 -08:00
Jian Cai	9dfeec8530	Reland "[AsmParser] make .ascii support spaces as separators" This relands commit `e0963ae274`, which was reverted on commit `82c4153e66` due to a test failure, which turned out to be a false positive.	2021-01-14 17:51:47 -08:00
Jian Cai	82c4153e66	Revert "[AsmParser] make .ascii support spaces as separators" This reverts commit `e0963ae274`. The change breaks some GDB tests. Revert it while we investigate.	2021-01-13 14:38:22 -08:00
Kazu Hirata	1d0bc05551	[llvm] Use llvm::append_range (NFC)	2021-01-06 18:27:33 -08:00
Jian Cai	e0963ae274	[AsmParser] make .ascii support spaces as separators Currently the integrated assembler only allows commas as the separator between string arguments in .ascii. This patch adds support to using space as separators and make IAS consistent with GNU assembler. Link: https://github.com/ClangBuiltLinux/linux/issues/1196 Reviewed By: nickdesaulniers, jrtc27 Differential Revision: https://reviews.llvm.org/D91460	2020-12-20 22:41:00 -08:00
Fangrui Song	01d1de8196	[MC] Reject byte alignment if larger than or equal to 2**32 This is consistent with the resolution to power-of-2 alignments. Otherwise, emitCodeAlignment and emitValueToAlignment cannot handle alignments larger than 2**32 and will trigger assertion failure (PR35218). Note: GNU as as of 2.35 will use 1 for such a large byte `.align`	2020-12-20 14:17:00 -08:00
Tobias Burnus	1deff4009e	[MC][ELF] Accept abbreviated form with sh_flags and sh_entsize D73999 / commit `75af9da755` added for LLVM 11 a check that sh_flags and sh_entsize (and sh_type) changes are an error, in line with GNU assembler. However, GNU assembler accepts and GCC generates an abbreviated form: while the first .section contains the flags and entsize, subsequent sections simply contain the name without repeating entsize or flags. Do likewise for better compatibility. See https://bugs.llvm.org/show_bug.cgi?id=48201 Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D92052	2020-12-11 16:45:45 +00:00
Hongtao Yu	705a4c149d	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each of which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. 
The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. 
Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. 
An example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling We have a disassembling tool (llvm-profgen) that can display disassembly alongside pseudo probes. So far it only supports ELF executable files. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 17:29:28 -08:00
Derek Schuff	8d396acac3	[WebAssembly] Support COMDAT sections in assembly syntax This CL changes the asm syntax for section flags, making them more like ELF (previously "passive" was the only option). Now we also allow "G" to designate COMDAT group sections. In these sections we set the appropriate comdat flag on function symbols, and also avoid auto-creating a new section for them. This also adds asm-based tests for the changes D92691 to go along with the direct-to-object tests. Differential Revision: https://reviews.llvm.org/D92952 This is a reland of rG4564553b8d8a with a fix to the lit pipeline in llvm/test/MC/WebAssembly/comdat.ll	2020-12-10 16:43:59 -08:00
Derek Schuff	dd1aa4fdd8	Revert "[WebAssembly] Support COMDAT sections in assembly syntax" This reverts commit `4564553b8d`. It broke several buildbots.	2020-12-10 15:55:33 -08:00
Mitch Phillips	7ead5f5aa3	Revert "[CSSPGO] Pseudo probe encoding and emission." This reverts commit `b035513c06`. Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269	2020-12-10 15:53:39 -08:00
Mitch Phillips	b955eb688d	Revert "[NFC] Fix a gcc build break by using an explict constructor." This reverts commit `248b279cf0`. Reason: Dependency of patch that broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269	2020-12-10 15:53:38 -08:00
Derek Schuff	4564553b8d	[WebAssembly] Support COMDAT sections in assembly syntax This CL changes the asm syntax for section flags, making them more like ELF (previously "passive" was the only option). Now we also allow "G" to designate COMDAT group sections. In these sections we set the appropriate comdat flag on function symbols, and also avoid auto-creating a new section for them. This also adds asm-based tests for the changes D92691 to go along with the direct-to-object tests. Differential Revision: https://reviews.llvm.org/D92952	2020-12-10 14:46:24 -08:00
Hongtao Yu	248b279cf0	[NFC] Fix a gcc build break by using an explict constructor.	2020-12-10 11:21:40 -08:00
Hongtao Yu	b035513c06	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each of which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. 
The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. 
Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node, and the tree path from the root down to that node is the inline context of the probes. Emission proceeds top-down over the whole tree, recursively. The probes of a tree node are emitted together with the node's direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings the address is encoded as a delta from the previous probe, except for the first probe. Variable-length integer encoding, aka LEB128, is used for the address delta and the probe index.
Assembling
Alternatively, pseudo probes can be printed as assembly directives. This keeps the assembly readable and also shows how optimizations and pseudo probes affect each other, which is especially helpful for diff-time assembly analysis. A pseudo probe directive has the following operands, in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudo_probe` section in the object file.
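The address-delta and LEB128 scheme described above can be illustrated with a minimal Python sketch. This is not the LLVM implementation: the helper names `encode_uleb128` and `encode_probe_records` are hypothetical, and the bit positions inside the flags byte, the little-endian byte order of the absolute address, and the restriction to non-decreasing addresses are assumptions made only for illustration.

```python
import struct


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)  # high bit clear: last byte
            return bytes(out)


def encode_probe_records(probes):
    """Encode (index, type, address) tuples following the PROBE RECORDS layout.

    Assumptions (not stated in the text): the flags byte holds TYPE in
    bits 0-3, ATTRIBUTE in bits 4-6 and ADDRESS_TYPE in bit 7; absolute
    addresses are little-endian; addresses never decrease, so every
    delta fits in ULEB128.
    """
    out = bytearray()
    prev_addr = None
    for index, probe_type, addr in probes:
        out += encode_uleb128(index)
        is_delta = prev_addr is not None  # only the first probe is absolute
        attribute = 0  # reserved
        out.append((probe_type & 0xF) | (attribute << 4) | (int(is_delta) << 7))
        if is_delta:
            out += encode_uleb128(addr - prev_addr)
        else:
            out += struct.pack("<Q", addr)  # full uint64 code address
        prev_addr = addr
    return bytes(out)
```

With two hypothetical block probes at 0x2011a0 and 0x2011a5, `encode_probe_records([(1, 0, 0x2011A0), (3, 0, 0x2011A5)])` spends 10 bytes on the first probe (absolute address) and only 3 bytes on the second (a one-byte delta), which is the size saving the delta encoding is after.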
An example assembly looks like:
```
foo2: # @foo2
# %bb.0: # %bb0
    pushq %rax
    testl %edi, %edi
    .pseudoprobe 837061429793323041 1 0 0
    je .LBB1_1
# %bb.2: # %bb2
    .pseudoprobe 837061429793323041 6 2 0
    callq foo
    .pseudoprobe 837061429793323041 3 0 0
    .pseudoprobe 837061429793323041 4 0 0
    popq %rax
    retq
.LBB1_1: # %bb1
    .pseudoprobe 837061429793323041 5 1 0
    callq %rsi
    .pseudoprobe 837061429793323041 2 0 0
    .pseudoprobe 837061429793323041 4 0 0
    popq %rax
    retq
# -- End function
    .section .pseudo_probe_desc,"",@progbits
    .quad 6699318081062747564
    .quad 72617220756
    .byte 3
    .ascii "foo"
    .quad 837061429793323041
    .quad 281547593931412
    .byte 4
    .ascii "foo2"
```
With inlining turned on, the assembly may look different around %bb2 with an inlined probe:
```
# %bb.2: # %bb2
    .pseudoprobe 837061429793323041 3 0
    .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6
    .pseudoprobe 837061429793323041 4 0
    popq %rax
    retq
```
Disassembling
We have a disassembling tool (llvm-profgen) that can display disassembly alongside pseudo probes. So far it only supports ELF executable files. An example disassembly looks like:
```
00000000002011a0 <foo2>:
  2011a0: 50 push rax
  2011a1: 85 ff test edi,edi
  [Probe]: FUNC: foo2 Index: 1 Type: Block
  2011a3: 74 02 je 2011a7 <foo2+0x7>
  [Probe]: FUNC: foo2 Index: 3 Type: Block
  [Probe]: FUNC: foo2 Index: 4 Type: Block
  [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
  2011a5: 58 pop rax
  2011a6: c3 ret
  [Probe]: FUNC: foo2 Index: 2 Type: Block
  2011a7: bf 01 00 00 00 mov edi,0x1
  [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
  2011ac: ff d6 call rsi
  [Probe]: FUNC: foo2 Index: 4 Type: Block
  2011ae: 58 pop rax
  2011af: c3 ret
```
Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 09:50:08 -08:00
Scott Linder	19c56e11fa	[MC] Fix ICE with non-newline terminated input There is an explicit option for the lexer to support this, but we crash when `-preserve-comments` is enabled because it checks for `getTok().getString().empty()` to detect the case. This doesn't work currently because the lexer reports this case as a string of length 1, containing a null byte. Change the lexer to instead report this case via an empty string, as the null terminator isn't logically a part of the textual input, and the check for `.empty()` seems natural and obvious in the calling code. Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D92681	2020-12-09 23:39:32 +00:00
Fangrui Song	9c53b2adc8	[MC] Delete unused declarations Notes: * llvm::createAsmStreamer: it has been moved to TargetRegistry.h * (anon ns)::WasmObjectWriter::updateCustomSectionRelocations: remnant of D46335 * COFFAsmParser::ParseSEHRegisterNumber: remnant of D66625 * llvm::CodeViewContext::isValidCVFileNumber: accidentally added by r279847	2020-12-06 15:36:39 -08:00
Scott Linder	d55d6806ad	[MC] Consume EndOfStatement in .cfi_{sections,endproc} Previously these directives were always interpreted as having an extra blank line after them. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D92612	2020-12-04 22:30:29 +00:00
Eric Astor	c64037b784	[ms] [llvm-ml] Support command-line defines Enable command-line defines as textmacros Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D90059	2020-12-01 18:06:05 -05:00
Eric Astor	abef659a45	[ms] [llvm-ml] Implement the statement expansion operator If prefaced with a %, expand text macros and macro functions in any statement. Also, prevent expanding text macros in the message of an ECHO directive unless expanded explicitly by the statement expansion operator. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D89740	2020-11-30 14:33:24 -05:00

1151 Commits