llvm-project

Commit Graph

Author	SHA1	Message	Date
Chen Zheng	c941d925b0	[MachineCycle][NFC] add a cache for block and its top level cycle This solves https://github.com/llvm/llvm-project/issues/57664 Reviewed By: sameerds Differential Revision: https://reviews.llvm.org/D134019	2022-09-21 01:31:04 -04:00
Craig Topper	70a64fe7b1	[RISCV] Remove support for the unratified Zbt extension. This extension does not appear to be on its way to ratification. Out of the unratified bitmanip extensions, this one had the largest impact on the compiler. Posting this patch to start a discussion about whether we should remove these extensions. We'll talk more at the RISC-V sync meeting this Thursday. Reviewed By: asb, reames Differential Revision: https://reviews.llvm.org/D133834	2022-09-20 20:26:48 -07:00
Shubham Sandeep Rastogi	636de2bf34	Change isLittleEndian to follow llvm style and add an accessor Differential Revision: https://reviews.llvm.org/D134290	2022-09-20 17:00:47 -07:00
Scott Linder	f583151461	[NFC][AMDGPU] Refactor AMDGPUDisassembler Clean up ahead of a patch to fix bugs in the AMDGPUDisassembler. Use lit.local.cfg substitutions and more idiomatic use of split-file to simplify and extend existing kernel-descriptor disassembly tests. Add a comment to AMDHSAKernelDescriptor.h, as at least one small step towards keeping all kernel-descriptor sensitive code in sync. Reviewed By: kzhuravl, arsenm Differential Revision: https://reviews.llvm.org/D130105	2022-09-20 20:37:19 +00:00
Kazu Hirata	00874c48ea	[IPO] Reorder parameters of InlineFunction (NFC) With the recent addition of new parameter MergeAttributes (D134117), callers need to specify several default parameters before getting to specify the new parameter. This patch reorders the parameters so that callers do not have to specify as many default parameters. Differential Revision: https://reviews.llvm.org/D134125	2022-09-20 09:09:38 -07:00
Eric Li	403d72cd43	[Support][NFC] Clarify function comment Follow-up to `86118ec2` that addresses the comments in D134072, which were accidentally left off of the commit.	2022-09-20 11:10:16 -04:00
Eric Li	86118ec2d0	[Support] Provide access to the full mapping in llvm::Annotations Providing access to the mapping of annotations allows test helpers to be expressive by using the annotations as expectations. For example, a matcher could verify that all annotated points were matched by a matcher, or that a refactoring surgically modifies specific ranges. Differential Revision: https://reviews.llvm.org/D134072	2022-09-20 11:06:21 -04:00
Vy Nguyen	016c2f5e32	[lld-macho] Support -dyld_env This arg is undocumented but from looking at the code + experiment, it's used to add additional DYLD_ENVIRONMENT load commands to the output. Differential Revision: https://reviews.llvm.org/D134058	2022-09-20 10:16:45 -04:00
Caroline Concatto	d32b8fdbdb	[LLVM][AArch64] Replace aarch64.sve.ld by aarch64.sve.ldN.sret This patch removes the intrinsic aarch64.sve.ldN from tablegen in favour of using aarch64.sve.ldN.sret. Depends on: D133023 Differential Revision: https://reviews.llvm.org/D133025	2022-09-20 13:15:07 +01:00
luxufan	bfd31c6f12	[MemorySSA][NFC] Use const whenever possible Differential Revision: https://reviews.llvm.org/D134162	2022-09-20 02:21:02 +00:00
Matt Arsenault	2adae8e1b7	VectorCombine: Pass through AssumptionCache	2022-09-19 19:25:22 -04:00
Matt Arsenault	ce44357216	Analysis: Add AssumptionCache to isSafeToSpeculativelyExecute Does not update any of the uses.	2022-09-19 19:25:22 -04:00
Matt Arsenault	34fb7803f8	GlobalISel: Pass through AssumptionCache	2022-09-19 19:10:51 -04:00
Matt Arsenault	bcb931c484	SelectionDAG: Add AssumptionCache analysis dependency Fixes compile time regression after `bb70b5d406`	2022-09-19 19:10:51 -04:00
Matt Arsenault	0d8ffcc532	Analysis: Add AssumptionCache argument to isDereferenceableAndAlignedPointer This does not try to pass it through from the end users.	2022-09-19 18:57:33 -04:00
Mircea Trofin	c625c17b88	[lld][thinlto] Include -mllvm options in the thinlto cache key They may modify thinlto optimization. This patch only extends support for `-mllvm`. There is another way to pass llvm flags, `-plugin-opt`, but its processing is different and will be provided in a subsequent patch. Differential Revision: https://reviews.llvm.org/D134013	2022-09-19 12:04:17 -07:00
Fangrui Song	0b140d0910	[Object] Add zstd decompression support to Decompressor llvm::object::Decompressor is used by many DWARF consumers like llvm-dwarfdump, llvm-dwp, llvm-symbolizer. Add tests to them. The lldb test can be left to D133530. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D134116	2022-09-19 11:41:16 -07:00
zhijian	dcd5abd4c4	[AIX] llvm-readobj: support a new option --exception-section for XCOFF object files. Summary: llvm-readobj supports a new option --exception-section for XCOFF object files. https://www.ibm.com/docs/en/aix/7.2?topic=formats-xcoff-object-file-format#XCOFF__iua3i23ajbau Reviewers: James Henderson, Paul Scoropan Differential Revision: https://reviews.llvm.org/D133030	2022-09-19 10:55:48 -04:00
Nikita Popov	dd61726d5b	Revert "[SimplifyCFG] accumulate bonus insts cost" This reverts commit `e5581df60a`. This causes major compile-time regressions, about 2-3% end-to-end on CTMark.	2022-09-19 14:46:43 +02:00
Max Kazantsev	818b1ab84e	[SCEV][NFC] Remove unused parameter from forgetLoopDispositions Let's be honest about it, we don't drop loop dispositions for particular loops. Remove the parameter that misleadingly makes it apparent that we do.	2022-09-19 14:06:42 +07:00
Kazu Hirata	6b49f30fca	[llvm] Deprecate llvm::empty (NFC) This patch deprecates llvm::empty as I've migrated all known uses of llvm::empty(x) to x.empty(). Differential Revision: https://reviews.llvm.org/D134141	2022-09-18 22:01:32 -07:00
Kazu Hirata	2078350645	Use std::make_unsigned_t (NFC)	2022-09-18 18:41:02 -07:00
Lang Hames	0e43f3b04d	[ORC][ORC-RT] Make WrapperFunctionCall::Create support void functions. Serialized calls to void-wrapper-functions should have zero bytes of argument data, but accessing ArgData[0] may (and will, in the case of SmallVector) fail if the argument data buffer is empty. This commit fixes the issue by adding a check for empty argument buffers.	2022-09-18 17:53:45 -07:00
Yaxun (Sam) Liu	e5581df60a	[SimplifyCFG] accumulate bonus insts cost SimplifyCFG folds bool foo() { if (cond1) return false; if (cond2) return false; return true; } as bool foo() { if (cond1 | cond2) return false; return true; } 'cond2' is called 'bonus insts' in branch folding because it introduces overhead: the original CFG could exit early, but the folded CFG always executes it. SimplifyCFG calculates the cost of the 'bonus insts' of folding a BB into its predecessor BB which shares the destination. If it is below bonus-inst-threshold, SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed. When SimplifyCFG calculates the cost of 'bonus insts', it only considers the 'bonus insts' in the current BB being considered for folding. This causes an issue for unrolled loops which share destinations, e.g. bool foo(int *a) { for (int i = 0; i < 32; i++) if (a[i] > 0) return false; return true; } After unrolling, it becomes bool foo(int *a) { if(a[0]>0) return false; if(a[1]>0) return false; //... if(a[31]>0) return false; return true; } SimplifyCFG will merge each BB with its predecessor BB, and ends up with 32 'bonus insts' which are always executed, which is much slower than the original CFG. The root cause is that SimplifyCFG does not consider the accumulated cost of 'bonus insts' which are folded from different BBs. This patch fixes that by introducing a ValueMap to track the costs of 'bonus insts' coming from different BBs into the same BB, and cuts off if the accumulated cost exceeds a threshold. Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault Differential Revision: https://reviews.llvm.org/D132408	2022-09-18 20:21:14 -04:00
Kazu Hirata	82293ed486	[ModuleInliner] Remove unused using declarations (NFC)	2022-09-18 14:27:06 -07:00
Kazu Hirata	3e720fa9dc	Use std::decay_t (NFC)	2022-09-18 10:25:08 -07:00
Kazu Hirata	5e5a6c5b07	Use std::conditional_t (NFC)	2022-09-18 10:25:06 -07:00
Kazu Hirata	919b638eff	[ADT] Use std::common_type_t (NFC)	2022-09-18 10:25:04 -07:00
Kazu Hirata	d3b95ecc98	[ModuleInliner] Remove InlineOrder::front (NFC) InlineOrder::front is a remnant from the era when we had a nested "while" loops in the module inliner, with the inner one grouping the call sites with the same caller. Now that we have a simple "while" loop draining the priority queue, we can just use InlineOrder::pop. Differential Revision: https://reviews.llvm.org/D134121	2022-09-18 08:49:44 -07:00
Kazu Hirata	284f0397e2	[Transforms] Merge function attributes within InlineFunction (NFC) In the past, we've had a bug resulting in a compiler crash after forgetting to merge function attributes (D105729). This patch teaches InlineFunction to merge function attributes. This way, we minimize the "time" when the IR is valid, but the function attributes are not. Differential Revision: https://reviews.llvm.org/D134117	2022-09-17 23:10:23 -07:00
Kai Nacke	ae35188f97	[GISel] Fix match tree emitter. The following changes are necessary to get the generated tree matcher to compile: - In CodeExpansions::declare(), the assert() prevents connecting two instructions. E.g. the match code (match (MUL $t, $s1, $s2), (SUB $d, $t, $s3)), results in two declarations of $t, one for the def and one for the use. Removing the assertion allows this construct. If $t is later used, it is one of the operands, which should be perfectly fine. - The code emitted in GIMatchTreeVRegDefPartitioner::generatePartitionSelectorCode() is not compilable: - The value of NewInstrID should be emitted, not the name - Both calls involving getOperand() end with one parenthesis too many - Swaps generated condition for the partition code in the latter function It also changes the rules i2p_to_p2i, fabs_fabs_fold, and fneg_fneg_fold to use the tree matcher for a linear match. These rules are tested by: CodeGen/AArch64/GlobalISel/combine-fabs.mir CodeGen/AArch64/GlobalISel/combine-fneg.mir CodeGen/AArch64/GlobalISel/combine-ptrtoint.mir CodeGen/AMDGPU/GlobalISel/combine-add-nullptr.mir Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D133257	2022-09-18 00:00:15 +00:00
Aiden Grossman	e5e3dccd07	[mlgo] Add in-development instruction based features for regalloc advisor This patch adds instruction based features to the regalloc advisor, gated behind a flag so a user can decide at runtime whether or not they want to enable the feature. The features are only enabled when LLVM is compiled in MLGO development mode (LLVM_HAVE_TF_API is set to true). To extract the instruction features, I'm taking a list of segments from each LiveInterval and noting the start and end SlotIndices. This list is then sorted based on the start SlotIndex and I iterate through each SlotIndex to grab instructions, making sure to check for overlaps. This results in a vector of opcodes and a binary mapping matrix that maps live ranges to the opcodes of the instructions within that LR. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D131930	2022-09-17 19:54:45 +00:00
Kazu Hirata	20d764aff0	[llvm] Don't include SetVector.h (NFC) llvm/lib/ProfileData/RawMemProfReader.cpp uses SetVector without including SetVector.h, so this patch adds an appropriate #include there.	2022-09-17 12:36:43 -07:00
Fangrui Song	367997d0d6	[Support] Rename llvm::compression::{zlib,zstd}::uncompress to more appropriate decompress This improves consistency with other places (e.g. llvm::compression::decompress, llvm::object::Decompressor::decompress, llvm-objcopy). Note: when zstd::uncompress was added, we noticed that the API `ZSTD_decompress` is fine while the zlib API `uncompress` is a misnomer.	2022-09-17 12:35:17 -07:00
Kazu Hirata	6170437df5	[ModuleInliner] Remove unnecessary #includes (NFC) While I am at it, this patch removes an unnecessary forward declaration.	2022-09-17 12:05:35 -07:00
Kazu Hirata	5faf4bf195	[ModuleInliner] Move UseInlinePriority to InlineOrder.cpp (NFC) UseInlinePriority specifies the priority function. This patch simplifies the code by moving UseInlinePriority closer to the actual consumer -- the switch statement inside getInlineOrder. Differential Revision: https://reviews.llvm.org/D134100	2022-09-17 11:41:28 -07:00
Sander de Smalen	bed214cf0f	[AArch64][SME] Add intrinsics for enabling/disabling ZA. This adds the intrinsics: * void @llvm.aarch64.sme.za.enable() -> smstart za * void @llvm.aarch64.sme.za.disable() -> smstop za Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D133894	2022-09-17 16:41:42 +00:00
Shivam Gupta	78533528cf	[NFC] Fix a comment in InitializePasses.h	2022-09-17 19:09:26 +05:30
Kazu Hirata	29c841ce93	Revert "[llvm] Remove llvm::is_trivially_{copy/move}_constructible (NFC)" This reverts commit `01ffe31cbb`. A build breakage with GCC 7.3 has been reported: https://reviews.llvm.org/D132311#3797053 FWIW, GCC 7.5 is OK according to Pavel Chupin. I also personally tested GCC 8.4.0.	2022-09-16 18:26:20 -07:00
Kazu Hirata	6e30a9cc08	[Inliner] Retire DefaultInlineOrder (NFC) DefaultInlineOrder was largely an exercise in generalizing the traversal order of call sites within the inliner. Now that the module inliner is starting to form its shape, there is no point in sharing DefaultInlineOrder between the module inliner and the CGSCC inliner. DefaultInlineOrder and all the other inline orders are mutually exclusive in the following sense: - The use of DefaultInlineOrder doesn't make sense in the module inliner because there is no priority inherent in the order in which call sites are added to the list of call sites -- SmallVector. - The use of any other inline order doesn't make sense in the CGSCC inliner because little prioritization can be done within one CGSCC. This patch essentially reverts the addition of DefaultInlineOrder so that the loop structure of Inliner.cpp looks like the state just before we started working on the module inliner (circa June 2021). At the same time, we remove the choice of DefaultInlineOrder from UseInlinePriority. Differential Revision: https://reviews.llvm.org/D134080	2022-09-16 15:36:40 -07:00
Jessica Paquette	1076b31da8	[GlobalISel] Combine select + fcmp to fminnum/fmaxnum/fminimum/fmaximum This is a partial port of the code used by the SelectionDAGBuilder to translate selects. In particular, see matchSelectPattern in ValueTracking.cpp. This is a GISel-equivalent of the portion which handles fminnum/fmaxnum/fminimum/fmaximum. I tried to set it up so it'd be easy to add the non-FP cases. Those are simpler. On the AArch64-end, it seems like the FP cases are more important for perf right now, so I bit the bullet and went at the more complicated problem. :) I elected to do this as a post-legalize combine rather than in the IRTranslator because: (1) deciding which fmax/fmin to use can depend on legalization rules; (2) philosophically speaking (TM), putting it in a combine just feels cleaner; (3) being able to enable/disable the combine is handy. Another option would be to use the ValueTracking code in the IRTranslator and match what SelectionDAGBuilder::visitSelect does. I think that may be somewhat annoying since we'd need to write lowerings back into the selects in the legalizer. I'm not strongly opposed to the approach. We'd also want to be careful with vector selects once that's implemented, which explicitly check if a vector select is legal on the target. That'd probably need a hook. From what I can tell, doing this as a combine is probably a cleaner option long-term. Differential Revision: https://reviews.llvm.org/D116702	2022-09-16 13:35:46 -07:00
David Majnemer	8a868d8859	Revert "Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView"" This reverts commit `cd20a18286` and adds a "let Heading" to NoStackProtectorDocs.	2022-09-16 19:39:48 +00:00
Kazu Hirata	e0bc76eb23	[ModuleInliner] Move InlinePriority and its derived classes to InlineOrder.cpp (NFC) These classes are referred to only from getInlineOrder in InlineOrder.cpp. This patch hides the entire class declarations and definitions in InlineOrder.cpp. Differential Revision: https://reviews.llvm.org/D134056	2022-09-16 12:32:16 -07:00
Justin Lebar	8cc3bfd13f	[NFC] Fix indentation in ValueTracking.h. In a separate patch I want to modify ValueTracking.h. When I touch the header, arc wants to clang-format the lines I touch (reasonable!). But then these whitespace changes get mixed into my patch.	2022-09-16 10:46:23 -07:00
mbs	7061a3f3f8	[support] Prepare TimeProfiler for cross-thread support This NFC prepares the TimeProfiler to support the construction and completion of time profiling 'entries' across threads. Add ClockType alias so we can change the clock in one place (trivial). Use C++ usings instead of typedefs. Rename Entry to TimeTraceProfilerEntry since this type will eventually become public. Add an intro comment. Add some smoke unit tests. Reviewed By: russell.gallop, rriddle, lattner, jloser Differential Revision: https://reviews.llvm.org/D133153	2022-09-16 10:20:18 -06:00
Yuta Mukai	116838b151	[MachinePipeliner] Fix the interpretation of the scheduling model The method of counting resource consumption is modified to be based on "Cycles" value when DFA is not used. The calculation of ResMII is modified to total "Cycles" and divide it by the number of units for each resource. Previously, ResMII was excessive because it was assumed that resources were consumed for the cycles of "Latency" value. The method of resource reservation is modified similarly. When a value of "Cycles" is larger than 1, the resource is considered to be consumed by 1 for cycles of its length from the scheduled cycle. To realize this, ResourceManager maintains a resource table for all slots. Previously, resource consumption was always 1 for 1 cycle regardless of the value of "Cycles" or "Latency". In addition, the number of micro operations per cycle is modified to be constrained by "IssueWidth". To disable the constraint, --pipeliner-force-issue-width=100 can be used. For the case of using DFA, the scheduling results are unchanged. Reviewed By: dpenry Differential Revision: https://reviews.llvm.org/D133572	2022-09-16 09:51:48 +09:00
Navid Emamdoost	3e52c0926c	Add -fsanitize-coverage=control-flow Reviewed By: kcc, vitalybuka, MaskRay Differential Revision: https://reviews.llvm.org/D133157	2022-09-15 15:56:04 -07:00
Alexander Timofeev	fbdea5a2e9	[AMDGPU] Always select s_cselect_b32 for uniform 'select' SDNode This patch contains changes necessary to carry physical condition register (SCC) dependencies through the SDNode scheduler. It adds the edge in the SDNodeScheduler dependency graph instead of inserting the SCC copy between each definition and use. This approach lets the scheduler place instructions in an optimal way placing the copy only when the dependency cannot be resolved. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D133593	2022-09-15 22:03:56 +02:00
Florian Hahn	81a11da762	[CGP,AArch64] Replace zexts with shuffle that can be lowered using tbl. This patch extends CodeGenPrepare to lower zext v16i8 -> v16i32 in loops using a wide shuffle creating a v64i8 vector, selecting groups of 3 zero elements and an element from the input. This is profitable on AArch64 where such shuffles can be lowered to tbl instructions, but only in loops, because it requires materializing 4 masks, which can be done in the loop preheader. This is the only reason the transform is part of CGP. If there's a better alternative I missed, please let me know. The same goes for the shouldReplaceZExtWithShuffle hook which guards this. I am not sure if this transform will be beneficial on other targets, but it seems like there is no other convenient way. This improves the generated code for loops like the one below in combination with D96522. int foo(uint8_t *p, int N) { unsigned long long sum = 0; for (int i = 0; i < N ; i++, p++) { unsigned int v = *p; sum += (v < 127) ? v : 256 - v; } return sum; } https://clang.godbolt.org/z/Wco866MjY Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D120571	2022-09-15 19:18:13 +01:00
Sergei Barannikov	c6acb4eb0f	[SDAG] Add `getCALLSEQ_END` overload taking `uint64_t`s All in-tree targets pass pointer-sized ConstantSDNodes to the method. This overload reduces the amount of boilerplate code a bit. This also makes getCALLSEQ_END consistent with getCALLSEQ_START, which already takes uint64_ts.	2022-09-15 14:02:12 -04:00
Jay Foad	3822a01e0b	[AMDGPU] Add GFX11 ds_bvh_stack_rtn_b32 instruction Differential Revision: https://reviews.llvm.org/D133928	2022-09-15 16:46:14 +01:00
Sander de Smalen	45d28779c5	[AArch64][SME] Fix lowering of llvm.aarch64.get.pstatesm() A thread may not have access to SME or TPIDR2_EL0, so in order to safely query PSTATE.SM in a streaming-compatible function, the code should call `__arm_sme_state()`, as described in the ABI: `c2bb09c4d4` This means that the value of pstate.sm is: * 0 if the function is non-streaming. * 1 if the function has `arm_streaming` or `arm_locally_streaming`. * evaluated at runtime by a call to __arm_sme_state() otherwise. This patch also adds a calling convention for calls to SME support routines. At some point we can remove the need for the llvm.aarch64.get.pstatesm() intrinsic and use function calls (with the corresponding cc) directly instead. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131571	2022-09-15 15:14:13 +00:00
Matt Arsenault	63d1d37d35	RegAllocGreedy: Avoid overflowing priority bitfields The class priority is expected to be at most 5 bits before it starts clobbering bits used for other fields. Also clamp the instruction distance in case we have millions of instructions. AMDGPU was accidentally overflowing into the global priority bit in some cases. I think in principle we would have wanted this, but in the cases I've looked at, it had the counterintuitive effect and de-prioritized the large register tuple. Avoid using the weird bit hack PPC uses for global priority. The AllocationPriority field is really 5 bits, and PPC was relying on overflowing this to 6 bits to forcibly set the global priority bit. Split this out as a separate flag to avoid having magic behavior for values above 31.	2022-09-15 10:38:40 -04:00
Alexey Lapshin	adabfb5e32	[DWARFLinker][NFC] Set the target DWARF version explicitly. Currently, DWARFLinker determines the target DWARF version internally. It examines incoming object files, detects the maximal DWARF version, and uses that version for the output file. This patch allows the consumer of DWARFLinker to set the output DWARF version explicitly, so that DWARFLinker uses the specified version instead of the autodetected one. It allows consumers to use different logic for setting the target DWARF version. For example, instead of the maximally used version, someone could set a higher version to convert from DWARFv4 to DWARFv5 (this possibility is not supported yet, but it would be good if the interface supported it). Another variant is to set the target version through the command line. In this patch, the autodetection is moved into the consumers (DwarfLinkerForBinary.cpp, DebugInfoLinker.cpp). Differential Revision: https://reviews.llvm.org/D132755	2022-09-15 16:06:10 +03:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `7539e9cf81`.	2022-09-15 03:08:46 +00:00
Vitaly Buka	72b776168c	[IRBuilder] Add CreateMaskedExpandLoad and CreateMaskedCompressStore	2022-09-14 19:18:52 -07:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) on the number of arguments, (3) forwarded arguments must be cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Sam Clegg	8273ca1421	[MC] Fix typo in getSectionAddressSize comment. NFC The comment was referring to a now non-existent function that was removed in `93e3cf0ebd`. Differential Revision: https://reviews.llvm.org/D133098	2022-09-14 15:15:41 -07:00
Craig Topper	50a699e362	[IR][VP] Remove IntrArgMemOnly from vp.gather/scatter. IntrArgMemOnly is only valid for intrinsics that use a scalar pointer argument. These intrinsics use a vector of pointers. Alias analysis will try to find a scalar pointer argument and will return incorrect alias results when it doesn't find one. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133898	2022-09-14 15:00:07 -07:00
Arthur Eubanks	ccc9107ad6	[OptBisect] Add flag to print IR when opt-bisect kicks in -opt-bisect-print-ir-path=foo will dump the IR to foo when opt-bisect-limit starts skipping passes. Currently we don't print the IR if the opt-bisect-limit is higher than the total number of times opt-bisect is called. This makes getting the IR right before a bad transform easier. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D133809	2022-09-14 13:48:03 -07:00
Fangrui Song	25394c9d10	[llvm-objdump] Change printSymbolVersionDependency to use ELFFile API When .gnu.version_r is empty (allowed by readelf but warned by objdump), llvm-objdump -p may decode the next section as .gnu.version_r and may crash due to out-of-bounds C string reference. ELFFile<ELFT>::getVersionDependencies handles 0-entry .gnu.version_r gracefully. Just use it. Fix https://github.com/llvm/llvm-project/issues/57707 Differential Revision: https://reviews.llvm.org/D133751	2022-09-14 12:30:34 -07:00
Nikita Popov	b1cd393f9e	[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI) Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead. To give two examples of why this is useful: * D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely. The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring. Differential Revision: https://reviews.llvm.org/D130896	2022-09-14 16:34:41 +02:00
Martin Storsjö	e280940bfb	[Support] Access threadIndex via a wrapper function On Unix platforms, this wrapper function is inline, so it should expand to the same direct access to the thread local variable. On Windows, it's a non-inline function within Parallel.cpp, allowing making the thread_local variable static. Windows Native TLS doesn't support direct access to thread local variables in a different DLL, and GCC/binutils on Windows occasionally has problems with non-static thread local variables too. This fixes mingw dylib builds with native TLS after `e6aebff674`. At the same time, move the whole thread local variable within #if LLVM_ENABLE_THREADS to fix builds without threading support. Differential Revision: https://reviews.llvm.org/D133759	2022-09-14 09:19:27 +03:00
Pengxuan Zheng	ecb5ea6a26	[Object][COFF] Allow section symbol to be common symbol I ran into an lld-link error due to a symbol named ".idata$4" coming from some static library: .idata$4 should not refer to special section 0. Here is the symbol table entry for .idata$4: Symbol { Name: .idata$4 Value: 3221225536 Section: IMAGE_SYM_UNDEFINED (0) BaseType: Null (0x0) ComplexType: Null (0x0) StorageClass: Section (0x68) AuxSymbolCount: 0 } The symbol .idata$4 is a section symbol (IMAGE_SYM_CLASS_SECTION) and LLD currently handles it as a regular defined symbol since isCommon() returns false for this symbol. This results in the error ".idata$4 should not refer to special section 0" because lld-link asserts that regular defined symbols should not refer to section 0. Should this symbol be handled as a common symbol instead? LLVM currently only allows external symbols (IMAGE_SYM_CLASS_EXTERNAL) to be common symbols. However, the PE/COFF spec (see section "Section Number Values") does not seem to mention this restriction. Any thoughts? Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D133627	2022-09-13 18:07:02 -07:00
YongKang Zhu	5fa6b24354	Address feedback in https://reviews.llvm.org/D133637 https://reviews.llvm.org/D133637 fixes the problem that we hashed the pointer to a register mask instead of its raw content. Fix the same issue in `llvm::hash_value()`. Remove the added API `MachineOperand::getRegMaskSize()` to avoid potential confusion. Add an assert to emphasize that we probably should hash a machine operand iff it has an associated machine function, but keep the fallback logic in the original change. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D133747	2022-09-13 16:12:41 -07:00
Chris Bieneman	4b96f8996a	[DX] DXContainer does not support COMDAT The DXContainer is pretty primitive, but doesn't support COMDAT. We need to set that in the Triple so that Clang won't try to emit COMDATs.	2022-09-13 13:59:47 -05:00
theidexisted	0a1c8522f3	[NFC][ADT] Fix assert message Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D129632	2022-09-13 18:55:57 +00:00
Hendrik Greving	393a17b5d1	[ValueTypes] Define MVTs for v256i2/v128i4. Adds MVT::v256i2, MVT::v128i4. Differential Revision: https://reviews.llvm.org/D133603	2022-09-13 09:02:23 -07:00
Pavel Samolysov	02aaf8e3d6	[NFC][ScheduleDAGInstrs] Use structured bindings and emplace_back Some uses of std::make_pair and std::pair's first/second members in the ScheduleDAGInstrs.[cpp|h] files were replaced with the vector's emplace_back along with structured bindings from C++17.	2022-09-13 12:49:04 +03:00
Sylvestre Ledru	cd20a18286	Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView" Causing: https://github.com/llvm/llvm-project/issues/57709 This reverts commit `ab56719acd`.	2022-09-13 10:53:59 +02:00
Max Kazantsev	86d5586d78	[SCEVExpander] Recompute poison-generating flags on hoisting. PR57187 An instruction being hoisted could have nuw/nsw flags inferred from the old context, and we cannot simply move it to the new location keeping them because we are going to introduce new uses of it that didn't exist before. The example in https://github.com/llvm/llvm-project/issues/57187 shows how this can produce a branch on poison from an initially well-defined program. This patch forcefully recomputes poison-generating flags in the new context. Differential Revision: https://reviews.llvm.org/D132022 Reviewed By: fhahn, nikic	2022-09-13 12:56:35 +07:00
Aiden Grossman	eec183c171	[nfc] Refactor SlotIndex::getInstrDistance to better reflect actual functionality This patch refactors SlotIndex::getInstrDistance to SlotIndex::getApproxInstrDistance to better describe the actual functionality of this function. This patch also adds in some additional comments better documenting the assumptions that this function makes to increase clarity. Based on discussion on the LLVM Discourse: https://discourse.llvm.org/t/odd-behavior-in-slotindex-getinstrdistance/64934/5 Reviewed By: mtrofin, foad Differential Revision: https://reviews.llvm.org/D133386	2022-09-12 23:33:35 +00:00
Amara Emerson	25bcc8c797	[GlobalISel][Legalizer] Fix minScalarEltSameAsIf to handle p0 element types. The mutation the action generates tries to change the input type into the element type of larger vector type. This doesn't work if the larger element type is a vector of pointers since it creates an illegal mutation between scalar and pointer types. Differential Revision: https://reviews.llvm.org/D133671	2022-09-13 00:01:37 +01:00
David Majnemer	ab56719acd	[clang, llvm] Add __declspec(safebuffers), support it in CodeView __declspec(safebuffers) is equivalent to __attribute__((no_stack_protector)). This information is recorded in CodeView. While we are here, add support for strict_gs_check.	2022-09-12 21:15:34 +00:00
Kazu Hirata	9606608474	[llvm] Use x.empty() instead of llvm::empty(x) (NFC) I'm planning to deprecate and eventually remove llvm::empty. I thought about replacing llvm::empty(x) with std::empty(x), but it turns out that all uses can be converted to x.empty(). That is, no use requires the ability of std::empty to accept C arrays and std::initializer_list. Differential Revision: https://reviews.llvm.org/D133677	2022-09-12 13:34:35 -07:00
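The distinction the message draws between `x.empty()` and `std::empty(x)` can be sketched as follows. This is an illustration only; `hasNoOperands` and `arrayIsEmpty` are hypothetical names and do not appear in LLVM.

```cpp
#include <iterator>
#include <vector>

// The member function is enough whenever the argument is a container.
bool hasNoOperands(const std::vector<int> &Ops) { return Ops.empty(); }

// std::empty additionally accepts C arrays, which can never be empty.
bool arrayIsEmpty() {
  static const int Regs[] = {1, 2, 3};
  return std::empty(Regs);
}
```

Since, per the commit message, no call site needed the C array or `std::initializer_list` overloads, the member call is the simpler replacement.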
YongKang Zhu	481a32f587	Bug fix on stable hash calculation for machine operands RegisterMask and RegisterLiveOut MachineOperand::getRegMask() returns a pointer to register mask. We should hash the raw content of register mask instead of its pointer. Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D133637	2022-09-12 13:25:04 -07:00
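Why hashing the pointer is unstable can be shown with a small standalone sketch. It assumes a register mask is an array of 32-bit words and uses FNV-1a as a stand-in hash; this is not LLVM's stable hash implementation, and `hashMaskContents` and `hashMaskPointer` are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>

// FNV-1a over the bytes of each mask word. The result depends only on the
// mask contents, so two equal masks hash equally wherever they are stored.
uint64_t hashMaskContents(const uint32_t *Mask, std::size_t NumWords) {
  uint64_t H = 14695981039346656037ull; // FNV-1a 64-bit offset basis
  for (std::size_t I = 0; I < NumWords; ++I) {
    for (int Shift = 0; Shift < 32; Shift += 8) {
      H ^= (Mask[I] >> Shift) & 0xffu;
      H *= 1099511628211ull; // FNV-1a 64-bit prime
    }
  }
  return H;
}

// The problematic variant: the value is just the address, so identical masks
// at different addresses (or in different runs) produce different hashes.
uint64_t hashMaskPointer(const uint32_t *Mask) {
  return static_cast<uint64_t>(reinterpret_cast<std::uintptr_t>(Mask));
}
```

Two separately allocated arrays with the same words give equal results from `hashMaskContents` but different results from `hashMaskPointer`.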
Fangrui Song	e6aebff674	[ELF] Parallelize relocation scanning * Change `Symbol::flags` to a `std::atomic<uint16_t>` * Add `llvm::parallel::threadIndex` as a thread-local non-negative integer * Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex * Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output. MIPS and PPC64 use global states for relocation scanning. Keep serial scanning. Speed-up with mimalloc and --threads=8 on an Intel Skylake machine: * clang (Release): 1.27x as fast * clang (Debug): 1.06x as fast * chrome (default): 1.05x as fast * scylladb (default): 1.04x as fast Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64): * clang (Release): 1.31x as fast * scylladb (default): 1.06x as fast Reviewed By: andrewng Differential Revision: https://reviews.llvm.org/D133003	2022-09-12 12:56:35 -07:00
Craig Topper	38ffa2bb96	[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants. For remainder: If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves together and use a (BitWidth / 2) urem. If (BitWidth / 2) is a legal integer type, this urem will be expanded by DAGCombiner using multiply by magic constant. We do have to take into account that adding high and low together can produce a carry, making it a (BitWidth / 2)+1 bit number. So we need to also add back in the carry from the first addition. For division: We can use the above trick to compute the remainder, subtract that remainder from the dividend, then multiply by the multiplicative inverse of the Divisor modulo (1 << BitWidth). This is based on the section "Remainder by Summing Digits" in Hacker's Delight. The remainder trick is similar to a trick you may have learned for determining if a decimal number is divisible by 3. You can add all the digits together and see if the sum is divisible by 3. If you're not sure if the sum is divisible by 3, you can add its digits together. This can be repeated until you have a single decimal digit. If that digit is 3, 6, or 9, then the original number is divisible by 3. This works because 10 % 3 == 1. gcc already does this same trick. There are additional tricks gcc does for urem as well as srem, udiv, and sdiv that I plan to add in future patches. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130862	2022-09-12 10:34:52 -07:00
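The arithmetic in this message can be checked with a small standalone sketch for BitWidth = 64. It is not the LegalizeTypes code; `uremBySummingHalves` and `udivByExactInverse` are hypothetical names, and the multiplicative inverse is computed here with a Newton iteration, which is an assumption of the example rather than something the commit describes.

```cpp
#include <cassert>
#include <cstdint>

// 64-bit remainder using only a 32-bit urem.
// Valid only when (1 << 32) % Divisor == 1, e.g. Divisor = 3, 5, 15, 17.
uint32_t uremBySummingHalves(uint64_t X, uint32_t Divisor) {
  assert(Divisor != 0 && (uint64_t(1) << 32) % Divisor == 1);
  uint32_t Lo = static_cast<uint32_t>(X);
  uint32_t Hi = static_cast<uint32_t>(X >> 32);
  uint32_t Sum = Lo + Hi;             // may wrap around 2^32
  uint32_t Carry = Sum < Lo ? 1 : 0;  // carry out of the first addition
  Sum += Carry;                       // 2^32 == 1 (mod Divisor); cannot wrap again
  return Sum % Divisor;
}

// 64-bit quotient: subtract the remainder, then multiply by the inverse of
// Divisor modulo 2^64. Divisor is odd whenever the precondition above holds.
uint64_t udivByExactInverse(uint64_t X, uint32_t Divisor) {
  uint32_t Rem = uremBySummingHalves(X, Divisor);
  uint64_t Inv = Divisor;             // correct to 3 bits for any odd Divisor
  for (int I = 0; I < 5; ++I)         // each step doubles the correct bits
    Inv *= 2 - uint64_t(Divisor) * Inv;
  return (X - Rem) * Inv;             // exact, since X - Rem is a multiple
}
```

The precondition holds for 3 because 2^32 = 4294967296 leaves remainder 1 when divided by 3, which mirrors the decimal case where 10 % 3 == 1.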
Matthias Gehre	c1502425ba	Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth Also remove new-pass-manager version of ExpandLargeDivRem because there is no way yet to access TargetLowering in the new pass manager. Differential Revision: https://reviews.llvm.org/D133691	2022-09-12 17:06:16 +01:00
Jay Foad	210e6a993d	[GlobalISel] Simplify extended add/sub to add/sub with carry Simplify extended add/sub (with carry-in and carry-out) to add/sub with carry (with carry-out only) if carry-in is known to be zero. Differential Revision: https://reviews.llvm.org/D133702	2022-09-12 17:05:44 +01:00
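The equivalence behind this simplification can be sketched on plain 32-bit integers. This is an illustration of the arithmetic only, not GlobalISel code; `AddResult`, `addExtended`, and `addWithCarryOut` are hypothetical names.

```cpp
#include <cstdint>

struct AddResult {
  uint32_t Sum;
  bool CarryOut;
};

// Extended add: consumes a carry-in and produces a carry-out.
AddResult addExtended(uint32_t A, uint32_t B, bool CarryIn) {
  uint64_t Wide = uint64_t(A) + B + (CarryIn ? 1 : 0);
  return {static_cast<uint32_t>(Wide), (Wide >> 32) != 0};
}

// Add with carry-out only. When CarryIn is known to be zero, addExtended
// computes exactly this, so the cheaper form can be used instead.
AddResult addWithCarryOut(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B;
  return {Sum, Sum < A};
}
```

For any inputs, `addExtended(A, B, false)` and `addWithCarryOut(A, B)` return the same sum and carry-out.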
Alexey Bataev	dfe1e9dd79	[SLP]Improve reordering of clustered reused scalars. If the reused scalars are clustered, i.e. each part of the reused mask contains all elements of the original scalars exactly once, we can reorder those clusters to improve the whole ordering of the clustered vectors. Differential Revision: https://reviews.llvm.org/D133524	2022-09-12 06:52:25 -07:00
Matt Arsenault	7834194837	TableGen: Introduce generated getSubRegisterClass function Currently there isn't a generic way to get a smaller register class that can be produced from a subregister of a larger class. Replaces a manually implemented version for AMDGPU. This will be used to improve subregister support in the allocator.	2022-09-12 09:03:37 -04:00
Matt Arsenault	bb70b5d406	CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer Previously this was assuming pointsToConstantMemory implies dereferenceable.	2022-09-12 08:38:35 -04:00
David Spickett	739b69e655	[LLVM][AArch64] Explain that X19 is used as the frame base pointer register Fixes #50098 LLVM uses X19 as the frame base pointer, if it needs to. Meaning you can get warnings if you clobber that with inline asm. However, it doesn't explain why. The frame base register is not part of the ABI so it's pretty confusing why you get that warning out of the blue. This adds a method to explain a reserved register with X19 as the first one. The logic is the same as getReservedRegs. I could have added a return parameter to isASMClobberable and friends but found that there's a lot of things that call isReservedReg in various ways. So while one more method on the pile isn't great design, it is simpler right now to do it this way and only pay the cost if you are actually using a reserved register. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D133213	2022-09-12 09:18:09 +00:00
Johannes Doerfert	c922cac868	Revert "[Attributor] AAPointerInfo should allow "harmless" uses" Revert "[Attributor] Teach AAPointerInfo to look into aggregates" This reverts commit `844f6c5d03` and `4ed0a88cd8` as they broke the buildbots that run openmp/libomptarget/test/offloading/bug49021.cpp.	2022-09-11 21:37:54 -07:00
Johannes Doerfert	4ed0a88cd8	[Attributor] Teach AAPointerInfo to look into aggregates If we have a constant aggregate, e.g., as an initializer, we usually failed to extract the proper value/type from it. This patch provides the size and offset information necessary to extract the right part of the constant.	2022-09-11 20:16:11 -07:00
Johannes Doerfert	21711039e3	[OpenMP] Allow the Attributor to look at functions we also internalized This is important as we have accesses to globals in those which we need to categorize.	2022-09-11 20:16:11 -07:00
Kazu Hirata	a21c8be1cc	[llvm] Use std::aligned_storage_t (NFC)	2022-09-11 16:11:39 -07:00
Junduo Dong	6975ab7126	[Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework The previous implementation of time tracing in NewPassManager is direct but messy. The key code looks like the demo below: ``` /// Runs the function pass across every function in the module. PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &CG, CGSCCUpdateResult &UR) { /// ... PreservedAnalyses PassPA; { TimeTraceScope TimeScope(Pass.name()); PassPA = Pass.run(F, FAM); } /// ... } ``` Deciding by hand where to add the tracing code is tedious. With the PassInstrumentation framework, we can easily add `Before/After` callback functions that add the time tracing code. Differential Revision: https://reviews.llvm.org/D131960	2022-09-11 05:42:55 -07:00
sunho	d1c4d96126	[ORC][ORC_RT][COFF] Remove public bootstrap method. Removes public bootstrap method that is not really necessary and not consistent with other platform API. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D132780	2022-09-10 15:25:50 +09:00
sunho	73c4033987	[ORC][ORC_RT][COFF] Support dynamic VC runtime. Supports dynamic VC runtime. It implements atexit handling, which is required to load msvcrt.lib successfully (the object file containing the atexit symbol somehow resolves to static VC runtime symbols). It also defaults to the dynamic VC runtime, which tends to be more robust. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D132525	2022-09-10 15:25:49 +09:00
Joe Loser	62b8a61d6c	[llvm] Remove includes of `llvm/Support/STLArrayExtras.h` `llvm` and downstream internal callers no longer use `array_lengthof`, so drop the include everywhere. Differential Revision: https://reviews.llvm.org/D133600	2022-09-09 17:44:00 -06:00
Fangrui Song	058f17d3af	[ADT] Move LLVM_DEPRECATED before type after D133502 `[[deprecated(...)]]` cannot appear between `inline size_t`.	2022-09-09 15:56:58 -07:00
Joe Loser	5758c824da	[ADT] Mark `llvm::array_lengthof` as deprecated As a follow-up of `5e96cea1db`, mark `llvm::array_lengthof` as deprecated in favor of using `std::size` function directly. Differential Revision: https://reviews.llvm.org/D133502	2022-09-09 15:31:00 -06:00
Augie Fackler	4fea8ee540	OpenMP: mark allocptr attribute on __kmpc_free_shared Differential Revision: https://reviews.llvm.org/D124491	2022-09-09 14:09:18 -04:00
Nikita Popov	a9f312c7f4	[AST] Use BatchAA in aliasesUnknownInst() (NFCI)	2022-09-09 15:54:48 +02:00
Sebastian Neubauer	c7750c522e	Add helper func to get first non-alloca position The LLVM performance tips suggest that allocas should be placed at the beginning of the entry block. So far, llvm doesn’t provide any helper to find that position. Add BasicBlock::getFirstNonPHIOrDbgOrAlloca and IRBuilder::SetInsertPointPastAllocas(Function*) that get an insert position after the (static) allocas at the start of a function and use it in ShadowStackGCLowering. Differential Revision: https://reviews.llvm.org/D132554	2022-09-09 15:39:53 +02:00
Namhyung Kim	43efb5e445	[llvm-objdump] Create name for fake sections It doesn't have a section header string table so add a vector to have the strings and create name based on the program header type and the index. Differential Revision: https://reviews.llvm.org/D131290	2022-09-09 12:27:07 +01:00
Serge Pavlov	7b9fae05b4	[Clang] Use virtual FS in processing config files Clang has support of virtual file system for the purpose of testing, but treatment of config files did not use it. This change enables VFS in it as well. Differential Revision: https://reviews.llvm.org/D132867	2022-09-09 18:24:45 +07:00
Serge Pavlov	55e1441f7b	Revert "[Clang] Use virtual FS in processing config files" This reverts commit `9424497e43`. Some buildbots failed, reverted for investigation.	2022-09-09 16:43:15 +07:00
Serge Pavlov	9424497e43	[Clang] Use virtual FS in processing config files Clang has support of virtual file system for the purpose of testing, but treatment of config files did not use it. This change enables VFS in it as well. Differential Revision: https://reviews.llvm.org/D132867	2022-09-09 16:28:51 +07:00
Fangrui Song	781dea021a	[Support] Rename DebugCompressionType::Z to Zlib "Z" was so named when we had both gABI ELFCOMPRESS_ZLIB and the legacy .zdebug support. Now we have just one zlib format, we should use the more descriptive name.	2022-09-08 16:11:29 -07:00
raghavmedicherla	5d3cf8267f	Revert "Support: Add mapped_file_region::sync(), equivalent to msync" This reverts commit `142f51fc2f`. This shouldn't be committed, it got committed accidentally.	2022-09-08 12:49:52 -04:00
Thomas Lively	ac3b8df8f2	[WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32` As proposed in https://github.com/WebAssembly/relaxed-simd/issues/77. Only an LLVM intrinsic and a clang builtin are implemented. Since there is no bfloat16 type, use u16 to represent the bfloats in the builtin function arguments. Differential Revision: https://reviews.llvm.org/D133428	2022-09-08 08:07:49 -07:00
Joe Loser	5e96cea1db	[llvm] Use std::size instead of llvm::array_lengthof LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpful for C++17 or later: `std::size` already has support for C-style arrays. Change call sites to use `std::size` instead. Differential Revision: https://reviews.llvm.org/D133429	2022-09-08 09:01:53 -06:00
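A minimal sketch of the replacement this commit performs, using a made-up array; the `Names` table and `NumNames` constant are hypothetical and exist only for the example.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical table, for illustration only.
static const char *const Names[] = {"zlib", "zstd", "none"};

// std::size (C++17) deduces the element count of a C-style array at compile
// time, which is the job llvm::array_lengthof did before C++17.
constexpr std::size_t NumNames = std::size(Names);
static_assert(NumNames == 3, "Names has three entries");
```

Because the count comes from the array type, it stays correct when entries are added or removed.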
Eric Wang	d8a2d3f7d4	[NFC][Regalloc] Introduce the RegAllocPriorityAdvisorAnalysis This patch introduces the priority analysis and the priority advisor, the default implementation, and the scaffolding for introducing the other implementations of the advisor. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D132835	2022-09-08 07:50:03 -07:00
David Spickett	e428baf001	[LLVM][ARM] Remove options for armv2, 2A, 3 and 3M Fixes #57486 These pre v4 architectures are not specifically supported by codegen. As demonstrated in the linked issue. GCC has not supported 3M since GCC 9 and presumably 2 and 2A earlier than that. So we are aligned in that sense. (see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2abd6e34fcf3bd9f9ffafcaa47cdc3ed443f9add) This removes the options and associated testing. The Pre_v4 build attribute remains mainly because its absence would be more confusing. It will not be used other than to complete the list of build attributes as shown in the ABI. https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#3352the-target-related-attributes Reviewed By: nickdesaulniers, peter.smith, rengolin Differential Revision: https://reviews.llvm.org/D133109	2022-09-08 09:49:48 +00:00
Nikita Popov	96cb7c2273	[ConstantExpr] Remove fneg expression As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179, this removes the fneg constant expression (which is, incidentally, the only unary operator expression). Differential Revision: https://reviews.llvm.org/D133418	2022-09-08 10:24:55 +02:00
Fangrui Song	b6e1fd761d	[llvm-objcopy] Support --{,de}compress-debug-sections for zstd Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal: https://groups.google.com/g/generic-abi/c/satyPkuMisk ("Add new ch_type value: ELFCOMPRESS_ZSTD") Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D130458	2022-09-08 00:59:14 -07:00
Fangrui Song	a41977dd0f	[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress} as high-level API on top of `llvm::compression::{zlib,zstd}::`: * getReasonIfUnsupported: return nullptr if the specified format is supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...` * compress: dispatch to zlib::compress or zstd::compress * decompress: dispatch to zlib::uncompress or zstd::uncompress Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic dependency. There are 40+ uses in llvm-project. Add another enum class `llvm::compression::Format` to represent supported compression formats, which may be a superset of ELF compression formats. See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use case. Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") --- Note: this patch alone will cause -Wswitch to llvm/lib/ObjCopy/ELF/ELFObject.cpp Reviewed By: ckissane, dblaikie Differential Revision: https://reviews.llvm.org/D130506	2022-09-08 00:58:55 -07:00
Nikita Popov	0444b40ed3	Revert "[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}" This reverts commit `19dc3cff0f`. This reverts commit `5b19a1f8e8`. This reverts commit `9397648ac8`. This reverts commit `10842b4475`. Breaks the GCC build, as reported here: https://reviews.llvm.org/D130506#3776415	2022-09-08 09:33:12 +02:00
Fangrui Song	10842b4475	[Support] Work around GCC's enum support	2022-09-08 00:13:25 -07:00
Fangrui Song	5b19a1f8e8	[llvm-objcopy] Support --{,de}compress-debug-sections for zstd Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal: https://groups.google.com/g/generic-abi/c/satyPkuMisk ("Add new ch_type value: ELFCOMPRESS_ZSTD") Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D130458	2022-09-07 23:53:40 -07:00
Fangrui Song	19dc3cff0f	[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress} as high-level API on top of `llvm::compression::{zlib,zstd}::`: * getReasonIfUnsupported: return nullptr if the specified format is supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...` * compress: dispatch to zlib::compress or zstd::compress * decompress: dispatch to zlib::uncompress or zstd::uncompress Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic dependency. There are 40+ uses in llvm-project. Add another enum class `llvm::compression::Format` to represent supported compression formats, which may be a superset of ELF compression formats. See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use case. Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Differential Revision: https://reviews.llvm.org/D130506	2022-09-07 23:53:14 -07:00
gonglingqin	d5f7a2182d	[LoongArch] Add codegen support for atomicrmw xchg operation on LA32 Depends on D131228 Differential Revision: https://reviews.llvm.org/D131229	2022-09-08 13:57:53 +08:00
gonglingqin	b60f801607	[LoongArch] Add codegen support for atomicrmw xchg operation on LA64 In order to avoid the patch being too large, the atomicrmw xchg operation on LA32 will be added later Differential Revision: https://reviews.llvm.org/D131228	2022-09-08 13:57:26 +08:00
Fangrui Song	f48931f3a8	[NewPM] Switch -filter-passes from ClassName to pass-name NewPM -filter-passes (D86360) uses ClassName instead of pass-name as used in `-passes`, `-print-after`, etc. D87216 has added a mechanism to map ClassName to pass-name. Adopt it for -filter-passes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D133263	2022-09-07 22:02:26 -07:00
Marco Elver	97c2220565	[SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass Introduces the SanitizerBinaryMetadata instrumentation pass which uses the new MD_pcsections metadata kinds to instrument certain types of instructions and functions required for breakpoint-based sanitizers. The first intended user of the binary metadata emitted will be a variant of GWP-TSan [1]. GWP-TSan will require information about atomic accesses; to unambiguously determine if an access is atomic or not, we also require "covered" information which code has been compiled with SanitizerBinaryMetadata instrumentation enabled. [1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D130887	2022-09-07 21:25:40 +02:00
Andrea Di Biagio	3262794804	[MCA] Correctly check pipeline availability for partially overlapping resource groups. This patch mostly reverts commit `70b37f4c03` which fixed PR50725. In case of explicit consumption of multiple partially overlapping group resources, the ResourceManager was not correctly checking pipeline resources availability. The fix for PR50725 only partially addressed a few instances of that issue. This is a more general (although, technically slower) fix for that same issue. It also fixes Issue #57548 Thanks to Haohai Wen for the small reproducible.	2022-09-07 12:17:59 +01:00
Marco Elver	343700358f	[AsmPrinter] Emit PCs into requested PCSections Interpret MD_pcsections in AsmPrinter emitting the requested metadata to the associated sections. Functions and normal instructions are handled. Differential Revision: https://reviews.llvm.org/D130879	2022-09-07 11:36:02 +02:00
Marco Elver	31a548021b	[GlobalISel] Propagate PCSections metadata to MachineInstr Propagate (most) PC sections metadata to MachineInstr when GlobalISel is doing instruction selection. This change results in support for architectures using GlobalISel (such as -O0 with AArch64). Not all instructions may be supported yet, and requires further target-specific handling (such as done for AArch64 pseudo-atomics). Expanding supported instructions is planned on a case-by-case basis and new use cases for PC sections metadata. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130886	2022-09-07 11:36:02 +02:00
Marco Elver	0ba8886af5	[FastISel] Propagate PCSections metadata to MachineInstr Propagate PC sections metadata to MachineInstr when FastISel is doing instruction selection. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130884	2022-09-07 11:36:01 +02:00
Nikita Popov	98a3a340c3	[ConstantExpr] Don't create fneg expressions Don't create fneg expressions unless explicitly requested by IR or bitcode.	2022-09-07 11:27:25 +02:00
Marco Elver	da695de628	[MachineInstrBuilder] Introduce MIMetadata to simplify metadata propagation In many places DebugLoc and PCSections metadata are just copied along to propagate them through MachineInstrs. Simplify doing so by bundling them up in a MIMetadata class that replaces the DebugLoc argument to most BuildMI() variants. The DebugLoc-only constructors allow implicit construction, so that existing usage of `BuildMI(.., DL, ..)` works as before, and the rest of the codebase using BuildMI() does not require changes. NFC. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130883	2022-09-07 11:22:50 +02:00
Marco Elver	4c58b00801	[SelectionDAG] Propagate PCSections through SDNodes Add a new entry to SDNodeExtraInfo to propagate PCSections through SelectionDAG. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130882	2022-09-07 11:22:50 +02:00
Jay Foad	1427d55d70	[TableGen] Document sequence with stride Document (in comments) the optional fourth "stride" argument to the sequence operator, which was added in svn r157416. Differential Revision: https://reviews.llvm.org/D133297	2022-09-07 09:58:22 +01:00
Vitaly Buka	4c18670776	[NFC][sancov] Rename ModuleSanitizerCoveragePass	2022-09-06 20:55:39 -07:00
Vitaly Buka	5e38b2a456	[NFC][msan] Rename ModuleMemorySanitizerPass	2022-09-06 20:30:35 -07:00
Vitaly Buka	93600eb50c	[NFC][asan] Rename ModuleAddressSanitizerPass	2022-09-06 15:02:11 -07:00
Vitaly Buka	e7bac3b9fa	[msan] Convert Msan to ModulePass The MemorySanitizerPass function pass violated requirement 4 of function passes, which is to not insert globals. Msan needs to insert globals for origin tracking and parameter tracking. https://llvm.org/docs/WritingAnLLVMPass.html#the-functionpass-class Reviewed By: kstoimenov, fmayer Differential Revision: https://reviews.llvm.org/D133336	2022-09-06 15:01:04 -07:00
Arthur Eubanks	7f57c97d30	[ThinLTOBitcodeWriter] Mark pass as required Or else with -opt-bisect-limit we don't write ThinLTO bitcode. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D133378	2022-09-06 14:47:34 -07:00
bzcheeseman	716b9f7a1a	[LLVM][Support/ADT] Add assert for isPresent to dyn_cast. This change adds an assert to dyn_cast that the value passed-in is present. In the past, this relied on the isa_impl assertion (which still works in many cases) but which we can tighten up for a better QoI. The PointerUnion change is because it seems like (based on the call sites) the semantics of the member dyn_cast are actually dyn_cast_if_present. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D133221	2022-09-06 13:58:56 -07:00
raghavmedicherla	142f51fc2f	Support: Add mapped_file_region::sync(), equivalent to msync Add mapped_file_region::sync(), equivalent to POSIX msync, synchronizing written content to disk without unmapping the region. Asserts if the mode is not mapped_file_region::readwrite. Note that I don't have access to a Windows machine, so I can't easily run those unit tests. Change by dexonsmith Differential Revision: https://reviews.llvm.org/D95494	2022-09-06 16:46:37 -04:00
Markus Böck	f049b2c3fc	[MC] Emit Stackmaps before debug info This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment. The gist of the issue is that Mach-O has restrictions on which kind of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. Problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter so far, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections. This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionally not a problem however, as emitting an empty `StackMaps` is a no-op. Differential Revision: https://reviews.llvm.org/D132708	2022-09-06 20:20:56 +02:00
Joseph Huber	58645d3252	[OpenMP] Fix `omp_get_wtime` function being marked incorrectly as readonly OpenMP has a list of optimistic attributes that can be attached to known runtime functions to aid some analysis. The `omp_get_wtime` function incorrectly used the `readonly` attribute. This is not correct, as the `omp_get_wtime` function changes values depending on some external state. This is more correctly modeled with `inaccessiblememonly`, meaning that the value does not depend on anything within the module, but cannot be removed as it depends on external state. Fixes #57578 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D133360	2022-09-06 12:59:00 -05:00
Jakub Kuderski	20573d11b7	[ADT] Remove is_splat `is_splat` is superseded by `all_equal` and marked as deprecated. See the discussion thread for more details: https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D132336	2022-09-06 13:49:26 -04:00
Matthias Gehre	7948d89afe	Fix "[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64" compilation on Windows	2022-09-06 16:11:14 +01:00
Marco Elver	7d63983c65	[SelectionDAG] Properly copy ExtraInfo on RAUW During SelectionDAG legalization SDNodes with associated extra info may be replaced with a new SDNode. Preserve associated extra info on ReplaceAllUsesWith and remove entries in DeallocateNode. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130881	2022-09-06 16:32:50 +02:00
Marco Elver	cc3faf4226	[SelectionDAG] Rename CallSiteDbgInfo to NodeExtraInfo For information infrequently attached to SDNodes, it is useful to provide a way to add this information out-of-line. This is already done for call-site specific information. Rename CallSiteDbgInfo to NodeExtraInfo in preparation of adding additional information not necessarily related to call sites only. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130880	2022-09-06 16:32:50 +02:00
Matthias Gehre	2090e85fee	[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64 This adds the ExpandLargeDivRem to the default pass pipeline. The limit at which it expands div/rem instructions is configured via a new TargetTransformInfo hook (default: no expansion) X86, Arm and AArch64 backends implement this hook to expand div/rem instructions with more than 128 bits. Differential Revision: https://reviews.llvm.org/D130076	2022-09-06 15:32:04 +01:00
Joseph Huber	5dbc7cf7ca	[Object] Refactor code for extracting offload binaries We currently extract offload binaries inside of the linker wrapper. Other tools may wish to do the same extraction operation. This patch simply factors out this handling into the `OffloadBinary.h` interface. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D132689	2022-09-06 08:55:16 -05:00
Marco Elver	42836e283f	[MachineInstr] Allow setting PCSections in ExtraInfo Provide MachineInstr::setPCSection(), to propagate relevant metadata through the backend. Use ExtraInfo to store the metadata. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130876	2022-09-06 15:52:44 +02:00
Marco Elver	c70f6e1362	[Metadata] Introduce MD_pcsections Introduces MD_pcsections metadata kind. See added documentation for more details. Subsequent patches enable propagating PC sections metadata through code generation to the AsmPrinter. RFC: https://discourse.llvm.org/t/rfc-pc-keyed-metadata-at-runtime/64191 Reviewed By: dvyukov, vitalybuka Differential Revision: https://reviews.llvm.org/D130875	2022-09-06 15:52:44 +02:00
Amara Emerson	3dd861818a	[GlobalISel] Combine G_INSERT/EXTRACT_VECTOR_ELT with out of bounds indices to undef. Differential Revision: https://reviews.llvm.org/D133309	2022-09-06 13:45:04 +01:00
luxufan	2e7aed1947	[MemorySSA][NFC] Simplify if condition Differential Revision: https://reviews.llvm.org/D133332	2022-09-05 10:43:17 +00:00
Eli Friedman	63335afb4e	[ARM64EC 2/?] Add target triple, and allow targeting it. Part of patchset to add initial support for ARM64EC. Per discussion on review, using the triple arm64ec-pc-windows-msvc. The parsing works the same way as Apple's alternate Arm ABI "arm64e". Differential Revision: https://reviews.llvm.org/D125412	2022-09-05 12:27:10 -07:00
Eli Friedman	488ad99ecf	[ARM64EC 1/?] Add parsing support to llvm-objdump/llvm-readobj. This is the first patch of a patchset to add initial support for ARM64EC. Basic documentation is available at https://docs.microsoft.com/en-us/windows/uwp/porting/arm64ec-abi . (Discourse post: https://discourse.llvm.org/t/initial-patches-for-arm64ec-windows-11-now-posted/62449 .) The file format for ARM64EC is basically identical to normal ARM64. There are a few extra sections, but the existing code for reading ARM64 object files just works. Differential Revision: https://reviews.llvm.org/D125411	2022-09-05 12:25:08 -07:00
Joseph Huber	c1d19a8489	[ELF] Provide the GNU hash function in libObject GNU uses a different hashing function compared to the sys-V standard function already provided in libObject. This is already used internally in LLD for generating synthetic sections. This patch simply extracts this definition and makes it available to other users of `libObject`. This is done in preparation for supporting symbol name lookups via the GNU hash table. Reviewed By: MaskRay, jhenderson Differential Revision: https://reviews.llvm.org/D132696	2022-09-05 11:04:57 -05:00
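For reference, the GNU symbol hash is the 32-bit "times 33" string hash seeded with 5381. The sketch below is a standalone illustration; the function name `gnuHash` and its `std::string_view` signature are assumptions of the example and are not claimed to match the declaration added to `libObject`.

```cpp
#include <cstdint>
#include <string_view>

// GNU-style symbol hash: H = H * 33 + C over the bytes of the name,
// starting from 5381 and truncated to 32 bits.
uint32_t gnuHash(std::string_view Name) {
  uint32_t H = 5381;
  for (unsigned char C : Name)
    H = (H << 5) + H + C; // (H << 5) + H is H * 33
  return H;
}
```

The empty name hashes to the seed value 5381, and each additional byte multiplies the running value by 33 before adding the byte.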
Kazu Hirata	2bb43d72d9	[ADT] Use std::tuple_element_t (NFC)	2022-09-03 23:27:24 -07:00
Kazu Hirata	03c3c2db10	[llvm] Use std::remove_reference_t (NFC)	2022-09-03 23:27:22 -07:00

49278 Commits