llvm-project

Commit Graph

Author	SHA1	Message	Date
Johannes Doerfert	d9659bf6a0	[OpenMP] Create custom state machines for generic target regions In the spirit of TRegions [0], this patch creates a custom state machine for a generic target region based on the potentially called parallel regions. The code analysis is done interprocedurally via an abstract attribute (AAKernelInfo). All outermost parallel regions are collected and we check if there might be unknown outermost parallel regions for which we need an indirect call. Other AAKernelInfo extensions are expected. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 Differential Revision: https://reviews.llvm.org/D101977	2021-07-10 17:57:08 -05:00
Joseph Huber	5ccb7424fa	[OpenMP] Change OpenMPOpt to check openmp metadata The metadata added in D102361 introduces a module flag that we can check to determine if the module was compiled with `-fopenmp` enables. We can now check for the precense of this instead of scanning the call graph for OpenMP runtime functions. Depends on D102361 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102423	2021-06-25 16:34:22 -04:00
Joseph Huber	44feacc736	[OpenMP] Change remaining globalization from an analysis remark to missed After landing the globalization optimizations, the precense of globalization on the device that was not put in shared or stack memory is a failed optimization with performance consequences so it should indicate a missed remark. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D104735	2021-06-22 16:52:06 -04:00
Joseph Huber	7d69da71dd	[OpenMP] Enable HeapToStack conversion in OpenMPOpt for new RTL globalization calls Summary: The changes to globalization introduced in D97680 introduce a large amount of overhead by default. The old globalization method would always ignore globalization code if executing in SPMD mode. This wasn't strictly correct as data sharing is still possible in SPMD mode. The new interface is correct but introduces globalization code even when unnecessary. This optimization will use the existing HeapToStack transformation in the attributor to allow for unneeded globalization to be replaced with thread-private stack memory. This is done using the newly introduced library instances for the RTL functions added in D102087. Depends on D97818 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102197	2021-06-22 13:23:05 -04:00
Joseph Huber	68d133a3e8	[OpenMP] Simplify GPU memory globalization Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads so must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type. This patch replaces this functinality with a single allocation command, effectively mimicing an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97680	2021-06-22 10:52:46 -04:00
Joseph Huber	b2ad63d3cf	[OpenMP] Add OpenMPOpt as a Module pass Summary: This patch registers OpenMPOpt as a Module pass in addition to a CGSCC pass. This is so certain optimzations that are sensitive to intact call-sites can happen before inlining. The old `openmpopt` pass name is changed to `openmp-opt-cgscc` and `openmp-opt` calls the Module pass. The current module pass only runs a single check but will be expanded in the future. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D99202	2021-04-20 12:28:58 -04:00

6 Commits