Commit Graph

16 Commits

Author SHA1 Message Date
Giorgis Georgakoudis 7cb4c26173 [OMPIRBuilder] Generate aggregate argument for parallel region outlined functions
Summary:
This patch modifies code generation in OpenMPIRBuilder to pass arguments
to the parallel region outlined function in an aggregate (struct),
besides the global_tid and bound_tid arguments. It depends on the
updated CodeExtractor (see D96854) for support. It mirrors functionality
of Clang codegen (see D102107).

Differential Revision: https://reviews.llvm.org/D110114
2022-01-25 20:53:45 -05:00
Johannes Doerfert 944aa0421c Reapply "[OpenMP][NFCI] Embed the source location string size in the ident_t"
This reverts commit 73ece231ee and
reapplies 7bfcdbcbf3 with mlir changes.
Also reverts commit 423ba12971 and
includes the unit test changes of
16da214004.
2021-12-29 01:10:38 -06:00
Mehdi Amini 73ece231ee Revert "[OpenMP][NFCI] Embed the source location string size in the ident_t"
This reverts commit 7bfcdbcbf3.
Broke MLIR build
2021-12-29 06:57:36 +00:00
Johannes Doerfert 7bfcdbcbf3 [OpenMP][NFCI] Embed the source location string size in the ident_t
One of the unused ident_t fields now holds the size of the string
(=const char *) field so we have an easier time dealing with those
in the future.

Differential Revision: https://reviews.llvm.org/D113126
2021-12-28 23:53:29 -06:00
Vyacheslav Zakharin 2e192ab1f4 [CodeExtractor] Preserve topological order for the return blocks.
Differential Revision: https://reviews.llvm.org/D108673
2021-08-25 08:09:01 -07:00
Johannes Doerfert c2281f1565 [Attributor] Introduce AAPointerInfo
This patch introduces AAPointerInfo which tracks the uses of a pointer
and places them in "bins" based on their offset from the base and access
size.

As with other AAs, any pointer can be tracked but it is up to the user
to make sense of the results. The user in this patch is AAValueSimplify
and AAPotentialValues which both utilize AAPointerInfo to determine the
value of a load. For now, this is restricted to loads of allocas and
internal globals. Through the use of AAPointerInfo and the "bins" we can
track struct members separately. The users also know that storing only
zeros (at unknown indices) will result in loading only 0 (from unknown
indices). Other than that, the users are flow and context insensitive
(for now).

To deal with the "bins" more easily, AAPointerInfo provides a
forallInterfearingAccesses that applies a callback on all accesses
that might interfere with a given load or store.

Differential Revision: https://reviews.llvm.org/D104432
2021-07-19 22:48:35 -05:00
Joseph Huber 5ccb7424fa [OpenMP] Change OpenMPOpt to check openmp metadata
The metadata added in D102361 introduces a module flag that we can check
to determine if the module was compiled with `-fopenmp` enables. We can
now check for the precense of this instead of scanning the call graph
for OpenMP runtime functions.

Depends on D102361

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102423
2021-06-25 16:34:22 -04:00
Joseph Huber 4c9471581f [Attributor] Set floating point loads and stores as nofree in AANoFreeFloating
Summary:
The current implementation of AANoFreeFloating will incorrectly list floating
point loads and stores as may-free. This prevents other attributor instances
like HeapToStack from pushing some allocations to the stack.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103975
2021-06-09 16:16:37 -04:00
Joseph Huber b2ad63d3cf [OpenMP] Add OpenMPOpt as a Module pass
Summary:
This patch registers OpenMPOpt as a Module pass in addition to a CGSCC
pass. This is so certain optimzations that are sensitive to intact
call-sites can happen before inlining. The old `openmpopt` pass name is
changed to `openmp-opt-cgscc` and `openmp-opt` calls the Module pass.
The current module pass only runs a single check but will be expanded in
the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D99202
2021-04-20 12:28:58 -04:00
Arthur Eubanks 6699029b67 [NewPM][opt] Run the "default" AA pipeline by default
We tend to assume that the AA pipeline is by default the default AA
pipeline and it's confusing when it's empty instead.

PR48779

Initially reverted due to BasicAA running analyses in an unspecified
order (multiple function calls as parameters), fixed by fetching
analyses before the call to construct BasicAA.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95117
2021-01-21 21:08:54 -08:00
Arthur Eubanks ba9b4ea4ee Revert "[NewPM][opt] Run the "default" AA pipeline by default"
This reverts commit be611431cd.

Other/new-pm-lto-defaults.ll failing
2021-01-21 20:16:34 -08:00
Arthur Eubanks be611431cd [NewPM][opt] Run the "default" AA pipeline by default
We tend to assume that the AA pipeline is by default the default AA
pipeline and it's confusing when it's empty instead.

PR48779

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95117
2021-01-21 19:46:38 -08:00
Giorgis Georgakoudis 9751705512 [OpenMPOpt][WIP] Expand parallel region merging
The existing implementation of parallel region merging applies only to
consecutive parallel regions that have speculatable sequential
instructions in-between. This patch lifts this limitation to expand
merging with any sequential instructions in-between, except calls to
unmergable OpenMP runtime functions. In-between sequential instructions
in the merged region are sequentialized in a "master" region and any
output values are broadcasted to the following parallel regions and the
sequential region continuation of the merged region.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D90909
2021-01-11 08:06:23 -08:00
Arthur Eubanks 2577708054 [test] Merge parallel_region_merging{,_legacy_pm}.ll
These are identical except for the RUN lines.
Also pin legacy RUN line to legacy PM.
2020-11-24 08:56:53 -08:00
Fangrui Song 491dd2711f [LazyCallGraph] Build SCCs of the reference graph in order
```
// The legacy PM CGPassManager discovers SCCs this way:
for function in the source order
  tarjanSCC(function)

// While the new PM CGSCCPassManager does:
for function in the reversed source order [1]
  discover a reference graph SCC
  build call graph SCCs inside the reference graph SCC
```

In the common cases, reference graph ~= call graph, the new PM order is
undesired because for `a | b | c` (3 independent functions), the new PM will
process them in the reversed order: c, b, a. If `a <-> b <-> c`, we can see
that `-print-after-all` will report the sole SCC as `scc: (c, b, a)`.

This patch corrects the iteration order. The discovered SCC order will match
the legacy PM in the common cases.

For some tests (`Transforms/Inline/cgscc-*.ll` and
`unittests/Analysis/CGSCCPassManagerTest.cpp`), the behaviors are dependent on
the SCC discovery order and there are too many check lines for the particular
order.  This patch simply reverses the function order to avoid changing too many
check lines.

Differential Revision: https://reviews.llvm.org/D90566
2020-11-02 13:22:42 -08:00
Giorgis Georgakoudis 3a6bfcf2f9 [OpenMPOpt] Merge parallel regions
There are cases that generated OpenMP code consists of multiple,
consecutive OpenMP parallel regions, either due to high-level
programming models, such as RAJA, Kokkos, lowering to OpenMP code, or
simply because the programmer parallelized code this way.  This
optimization merges consecutive parallel OpenMP regions to: (1) reduce
the runtime overhead of re-activating a team of threads; (2) enlarge the
scope for other OpenMP optimizations, e.g., runtime call deduplication
and synchronization elimination.

This implementation defensively merges parallel regions, only when they
are within the same BB and any in-between instructions are safe to
execute in parallel.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D83635
2020-10-09 09:59:04 -07:00