Commit Graph

42238 Commits

Author SHA1 Message Date
Jeremy Morse 121a49d839 [LiveDebugValues] Add switches for using instr-ref variable locations
This patch adds the -Xclang option
"-fexperimental-debug-variable-locations" and same LLVM CodeGen option,
to pick which variable location tracking solution to use.

Right now all the switch does is pick which LiveDebugValues
implementation to use, the normal VarLoc one or the instruction
referencing one in rGae6f78824031. Over time, the aim is to add fragments
of support in aid of the value-tracking RFC:

  http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html

also controlled by this command line switch. That will slowly move
variable locations to be defined by an instruction calculating a value,
and a DBG_INSTR_REF instruction referring to that value. Thus, this is
going to grow into a "use the new kind of variable locations" switch,
rather than just "use the new LiveDebugValues implementation".

Differential Revision: https://reviews.llvm.org/D83048
2020-08-25 14:58:48 +01:00
David Sherwood 7b64765cd1 [SVE] Fix TypeSize related warnings with IR truncates of scalable vectors
In getCastInstrCost when the instruction is a truncate we were relying
upon the implicit TypeSize -> uint64_t cast when asking if a given type
has the same size as a legal integer. I've changed the code to only
ask the question if the type is fixed length.

I have also changed InstCombinerImpl::SimplifyDemandedUseBits to bail
out for now if the type is a scalable vector.

I've added the following new tests:

  Analysis/CostModel/AArch64/sve-trunc.ll
  Transforms/InstCombine/AArch64/sve-trunc.ll

for both of these fixes.

Differential revision: https://reviews.llvm.org/D86432
2020-08-25 09:17:56 +01:00
Freddy Ye e02d081f2b [X86] Support -march=sapphirerapids
Support -march=sapphirerapids for x86.
Compare with Icelake Server, it includes 14 more new features. They are
amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote,
enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D86503
2020-08-25 14:21:21 +08:00
Fangrui Song 44ee9d070a Revert D85812 "[coroutine] should disable inline before calling coro split"
This reverts commit 2e43acfed8.

LLVMCoroutines (the library which contains Coroutines.h) depends on LLVMipo (the
library which contains SampleProfile.cpp). It is inappropriate for
SampleProfile.cpp to depent on Coroutines.h (circular dependency).

The test inverted dependencies as well:
llvm/test/Transforms/Coroutines/coro-inline.ll uses -sample-profile.
2020-08-24 11:41:05 -07:00
Matt Arsenault 116affb18d TableGen/GlobalISel: Allow inst matcher to check multiple opcodes
This is to initially handleg immAllOnesV, which should match
G_BUILD_VECTOR or G_BUILD_VECTOR_TRUNC. In the future, it could be
used for other patterns cases that map to multiple G_* instructions,
such as G_ADD and G_PTR_ADD.
2020-08-24 13:48:51 -04:00
dongAxis 2e43acfed8 [coroutine] should disable inline before calling coro split
summary:
When callee coroutine function is inlined into caller coroutine
function before coro-split pass, llvm will emits "coroutine should
have exactly one defining @llvm.coro.begin". It seems that coro-early
pass can not handle this quiet well.
So we believe that unsplited coroutine function should not be inlined.
This patch fix such issue by not inlining function if it has attribute
"coroutine.presplit" (it means the function has not been splited) to
fix this issue

TestPlan: check-llvm

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D85812
2020-08-24 22:22:08 +08:00
Francesco Petrogalli 5a34b3ab95 [llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]
Changes:

* Change `ToVectorTy` to deal directly with `ElementCount` instances.
* `VF == 1` replaced with `VF.isScalar()`.
* `VF > 1` and `VF >=2` replaced with `VF.isVector()`.
* `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`.
* Replaced the uses of `llvm::SmallSet<ElementCount, ...>` with
   `llvm::SmallSetVector<ElementCount, ...>`. This avoids the need of an
   ordering function for the `ElementCount` class.
* Bits and pieces around printing the `ElementCount` to string streams.

To guarantee that this change is a NFC, `VF.Min` and asserts are used
in the following places:

1. When it doesn't make sense to deal with the scalable property, for
example:
   a. When computing unrolling factors.
   b. When shuffle masks are built for fixed width vector types
In this cases, an
assert(!VF.Scalable && "<mgs>") has been added to make sure we don't
enter coepaths that don't make sense for scalable vectors.
2. When there is a conscious decision to use `FixedVectorType`. These
uses of `FixedVectorType` will likely be removed in favour of
`VectorType` once the vectorizer is generic enough to deal with both
fixed vector types and scalable vector types.
3. When dealing with building constants out of the value of VF, for
example when computing the vectorization `step`, or building vectors
of indices. These operation _make sense_ for scalable vectors too,
but changing the code in these places to be generic and make it work
for scalable vectors is to be submitted in a separate patch, as it is
a functional change.
4. When building the potential VFs in VPlan. Making the VPlan generic
enough to handle scalable vectorization factors is a functional change
that needs a separate patch. See for example `void
LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned
MaxVF)`.
5. The class `IntrinsicCostAttribute`: this class still uses `unsigned
VF` as updating the field to use `ElementCount` woudl require changes
that could result in changing the behavior of the compiler. Will be done
in a separate patch.
7. When dealing with user input for forcing the vectorization
factor. In this case, adding support for scalable vectorization is a
functional change that migh require changes at command line.

Note that in some places the idiom

```
unsigned VF = ...
auto VTy = FixedVectorType::get(ScalarTy, VF)
```

has been replaced with

```
ElementCount VF = ...
assert(!VF.Scalable && ...);
auto VTy = VectorType::get(ScalarTy, VF)
```

The assertion guarantees that the new code is (at least in debug mode)
functionally equivalent to the old version. Notice that this change had been
possible because none of the methods that are specific to `FixedVectorType`
were used after the instantiation of `VTy`.

Reviewed By: rengolin, ctetreau

Differential Revision: https://reviews.llvm.org/D85794
2020-08-24 13:54:03 +00:00
Francesco Petrogalli bad7d6b373 Revert "[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]"
Reverting because the commit message doesn't reflect the one agreed on
phabricator at https://reviews.llvm.org/D85794.

This reverts commit c8d2b065b9.
2020-08-24 13:50:55 +00:00
Matt Arsenault e1644a3779 GlobalISel: Reduce G_SHL width if source is extension
shl ([sza]ext x, y) => zext (shl x, y).

Turns expensive 64 bit shifts into 32 bit if it does not overflow the
source type:

This is a port of an AMDGPU DAG combine added in
5fa289f0d8. InstCombine does this
already, but we need to do it again here to apply it to shifts
introduced for lowered getelementptrs. This will help matching
addressing modes that use 32-bit offsets in a future patch.

TableGen annoyingly assumes only a single match data operand, so
introduce a reusable struct. However, this still requires defining a
separate GIMatchData for every combine which is still annoying.

Adds a morally equivalent function to the existing
getShiftAmountTy. Without this, we would have to do try to repeatedly
query the legalizer info and guess at what type to use for the shift.
2020-08-24 09:42:40 -04:00
Francesco Petrogalli c8d2b065b9 [llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]
Changes:

* Change `ToVectorTy` to deal directly with `ElementCount` instances.
* `VF == 1` replaced with `VF.isScalar()`.
* `VF > 1` and `VF >=2` replaced with `VF.isVector()`.
* `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`.
* Add `<` operator to `ElementCount` to be able to use
`llvm::SmallSetVector<ElementCount, ...>`.
* Bits and pieces around printing the ElementCount to string streams.
* Added a static method to `ElementCount` to represent a scalar.

To guarantee that this change is a NFC, `VF.Min` and asserts are used
in the following places:

1. When it doesn't make sense to deal with the scalable property, for
example:
   a. When computing unrolling factors.
   b. When shuffle masks are built for fixed width vector types
In this cases, an
assert(!VF.Scalable && "<mgs>") has been added to make sure we don't
enter coepaths that don't make sense for scalable vectors.
2. When there is a conscious decision to use `FixedVectorType`. These
uses of `FixedVectorType` will likely be removed in favour of
`VectorType` once the vectorizer is generic enough to deal with both
fixed vector types and scalable vector types.
3. When dealing with building constants out of the value of VF, for
example when computing the vectorization `step`, or building vectors
of indices. These operation _make sense_ for scalable vectors too,
but changing the code in these places to be generic and make it work
for scalable vectors is to be submitted in a separate patch, as it is
a functional change.
4. When building the potential VFs in VPlan. Making the VPlan generic
enough to handle scalable vectorization factors is a functional change
that needs a separate patch. See for example `void
LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned
MaxVF)`.
5. The class `IntrinsicCostAttribute`: this class still uses `unsigned
VF` as updating the field to use `ElementCount` woudl require changes
that could result in changing the behavior of the compiler. Will be done
in a separate patch.
7. When dealing with user input for forcing the vectorization
factor. In this case, adding support for scalable vectorization is a
functional change that migh require changes at command line.

Differential Revision: https://reviews.llvm.org/D85794
2020-08-24 13:39:42 +00:00
Sam Parker 4ce176bed2 [SCEV] Still (again) trying to fix buildbots 2020-08-24 11:24:30 +01:00
Sam Parker 2e194fe73b [SCEV] Still trying to fix windows buildbots 2020-08-24 10:26:48 +01:00
Sam Parker e286c600e1 [SCEV] Attempt to fix windows buildbots 2020-08-24 08:29:22 +01:00
Sam Parker b999400a4f [SCEV] Add operand methods to Cast and UDiv
Add methods to access operands in a similar manner to NAryExpr.

Differential Revision: https://reviews.llvm.org/D86083
2020-08-24 06:57:07 +01:00
Craig Topper cc7bf9bcbf [X86] Allow 32-bit mode only CPUs with -mtune on 64-bit targets
gcc errors on this, but I'm nervous that since -mtune has been
ignored by clang for so long that there may be code bases out
there that pass 32-bit cpus to clang.
2020-08-22 16:38:05 -07:00
Matt Arsenault 901e3317fe GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS
This switches from using G_EXTRACT in odd cases to widen with undef
and unmerge.
2020-08-22 10:25:53 -04:00
Florian Hahn 5e7e2162d4 [DSE,MemorySSA] Use BatchAA for AA queries.
We can use BatchAA to avoid some repeated AA queries. We only remove
stores, so I think we will get away with using a single BatchAA instance
for the complete run.

The changes in AliasAnalysis.h mirror the changes in D85583.

The change improves compile-time by roughly 1%.
http://llvm-compile-time-tracker.com/compare.php?from=67ad786353dfcc7633c65de11601d7823746378e&to=10529e5b43809808e8c198f88fffd8f756554e45&stat=instructions

This is part of the patches to bring down compile-time to the level
referenced in
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86275
2020-08-22 08:36:35 +01:00
Sourabh Singh Tomar f91d18eaa9 [DebugInfo][flang]Added support for representing Fortran assumed length strings
This patch adds support for representing Fortran `character(n)`.

Primarily patch is based out of D54114 with appropriate modifications.

Test case IR is generated using our downstream classic-flang. We're in process
of upstreaming flang PR's but classic-flang has dependencies on llvm, so
this has to get in first.

Patch includes functional test case for both IR and corresponding
dwarf, furthermore it has been manually tested as well using GDB.

Source snippet:
```
 program assumedLength
   call sub('Hello')
   call sub('Goodbye')
   contains
   subroutine sub(string)
           implicit none
           character(len=*), intent(in) :: string
           print *, string
   end subroutine sub
 end program assumedLength
```

GDB:
```
(gdb) ptype string
type = character (5)
(gdb) p string
$1 = 'Hello'
```

Reviewed By: aprantl, schweitz

Differential Revision: https://reviews.llvm.org/D86305
2020-08-22 10:13:40 +05:30
Alina Sbirlea f55ad3973d [DomTree] Extend update API to allow a post CFG view.
Extend the `applyUpdates` in DominatorTree to allow a post CFG view,
different from the current CFG.
This patch implements the functionality of updating an already up to
date DT, to the desired PostCFGView.
Combining a set of updates towards an up to date DT and a PostCFGView is
not yet supported.

Differential Revision: https://reviews.llvm.org/D85472
2020-08-21 17:23:08 -07:00
Roman Lebedev 503deec218
Temporairly revert "[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline"
As disscussed in post-commit review starting with
	https://reviews.llvm.org/D84108#2227365
while this appears to be mostly a win overall, especially code-size-wise,
this appears to shake //certain// code pattens in a way that is extremely
unfavorable for performance (+30% runtime regression)
on certain CPU's (i personally can't reproduce).

So until the behaviour is better understood, and a path forward is mapped,
let's back this out for now.

This reverts commit 1d51dc38d8.
2020-08-22 00:33:22 +03:00
Nicolai Hähnle b37db11d95 MachineSSAUpdater: Allow initialization with just a register class
The register class is required for inserting PHIs, but the "current
virtual register" isn't actually used for anything, so let's remove it
while we're at it.

Differential Revision: https://reviews.llvm.org/D85602

Change-Id: I1e647f31570ef21a7ea8e20db3454178e98a6a8b
2020-08-21 23:04:35 +02:00
Alina Sbirlea 7ea0ee3058 [DomTree] Avoid creating an empty GD to reduce compile time. 2020-08-21 13:45:00 -07:00
Serguei Katkov 9e362bb0eb [InstCombine] Remove unused entries in gc-live bundle of statepoint
If some of gc live value are not used in gc.relocate we can remove them
from gc-live bundle of statepoint instruction.

Also the CL removes duplicated Values in gc-live bundle.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D85959
2020-08-22 01:36:22 +07:00
Kamau Bridgeman 365f861c45 [PowerPC][PCRelative] Thread Local Storage Support for Initial Exec
This patch is the initial support for the Intial Exec Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D81947
2020-08-21 10:13:11 -05:00
diggerlin a081868921 [AIX][XCOFF] emit symbol visibility for xcoff object file.
SUMMARY:

Reviewers:  Jason liu

Differential Revision: https://reviews.llvm.org/D84265
2020-08-21 11:00:56 -04:00
Florian Hahn 8eded24bf4 Recommit "[SCEVExpander] Add helper to clean up instrs inserted while expanding."
Recommit the patch after fixing an issue reported caused by the fact
that re-used values are also added to InsertedValues.

Additional tests have been added in 88818491b9

This reverts the revert commit 38884641f2.
2020-08-21 15:04:17 +01:00
Xing GUO f5643dc3dc Recommit: [DWARFYAML] Add support for referencing different abbrev tables.
The original commit (7ff0ace96db9164dcde232c36cab6519ea4fce8) was causing
build failure and was reverted in 6d242a7326

==================== Original Commit Message ====================
This patch adds support for referencing different abbrev tables. We use
'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly
assign an abbrev table to compilation units.

The syntax is:
```
debug_abbrev:
  - ID: 0
    Table:
      ...
  - ID: 1
    Table:
      ...
debug_info:
  - ...
    AbbrevTableID: 1 ## Reference the second abbrev table.
  - ...
    AbbrevTableID: 0 ## Reference the first abbrev table.
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D83116
2020-08-21 19:02:10 +08:00
Roman Lebedev 5d7c5a5e99
[NFC] Port InstCount pass to new pass manager 2020-08-21 12:39:42 +03:00
Yevgeny Rouban 18bc400f97 [NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks
Both AfterPass and AfterPassInvalidated pass instrumentation
callbacks get additional parameter of type PreservedAnalyses.
This patch was created by @fedor.sergeev. I have just slightly
changed it.

Reviewers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81555
2020-08-21 16:10:42 +07:00
David Green 2b69efded0 [ARM][LV] Add a preferPredicatedReductionSelect target hook
As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to tail predicate most
reduction loops.

Differential Revision: https://reviews.llvm.org/D85980
2020-08-21 08:48:12 +01:00
Qiu Chaofan 91039784b3 [PowerPC] Add readflm/setflm intrinsics to Clang
Commit dbcfbffc adds ppc.readflm and ppc.setflm intrinsics to read or
write FPSCR register. This patch adds them to Clang.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D85874
2020-08-21 15:12:19 +08:00
Xing GUO 6d242a7326 Revert "[DWARFYAML] Add support for referencing different abbrev tables."
This reverts commit f7ff0ace96.

This change is causing build failure.

http://lab.llvm.org:8011/builders/clang-cmake-armv7-global-isel/builds/10400
2020-08-21 12:15:54 +08:00
Yevgeny Rouban 7d9a16241f [ADT] Allow IsSizeLessThanThresholdT for incomplete types. NFC
If the type T is incomplete then sizeof(T) results in C++ compilation error at line:
  static constexpr bool value = sizeof(T) <= (2 * sizeof(void *));

This patch allows incomplete types in parameters of function. Example:
  using SomeFunc = void(SomeIncompleteType &);
  llvm::unique_function<SomeFuncType> SomeFunc;

Reviewers: DaniilSuchkov, vvereschaka

Differential Revision: https://reviews.llvm.org/D81554
2020-08-21 11:01:57 +07:00
Xing GUO f7ff0ace96 [DWARFYAML] Add support for referencing different abbrev tables.
This patch adds support for referencing different abbrev tables. We use
'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly
assign an abbrev table to compilation units.

The syntax is:
```
debug_abbrev:
  - ID: 0
    Table:
      ...
  - ID: 1
    Table:
      ...
debug_info:
  - ...
    AbbrevTableID: 1 ## Reference the second abbrev table.
  - ...
    AbbrevTableID: 0 ## Reference the first abbrev table.
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D83116
2020-08-21 11:44:25 +08:00
Xing GUO 290e399f96 [DWARFYAML] Add support for emitting multiple abbrev tables.
This patch adds support for emitting multiple abbrev tables. Currently,
compilation units will always reference the first abbrev table.

Reviewed By: jhenderson, labath

Differential Revision: https://reviews.llvm.org/D86194
2020-08-21 10:12:08 +08:00
Michael Liao 5257a60ee0 [amdgpu] Add codegen support for HIP dynamic shared memory.
Summary:
- HIP uses an unsized extern array `extern __shared__ T s[]` to declare
  the dynamic shared memory, which size is not known at the
  compile time.

Reviewers: arsenm, yaxunl, kpyzhov, b-sumner

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82496
2020-08-20 21:29:18 -04:00
Jon Roelofs 74ca5275e9 Fix a couple of typos. NFC 2020-08-20 14:56:57 -06:00
Kamau Bridgeman b74b80bb2d [PowerPC][PCRelative] Thread Local Storage Support for General Dynamic
This patch is the initial support for the General Dynamic Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Patch by: NeHuang

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D82315
2020-08-20 15:08:13 -05:00
Simon Pilgrim 0ee23b286a Fix Wdocumentation unknown parameter warning. NFC. 2020-08-20 12:41:34 +01:00
Vitaly Buka ebdc886b5f [APInt] Allow self-assignment with libstdc++
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/8256/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Athinlto-function-summary-paramaccess.ll

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D86053
2020-08-20 04:14:40 -07:00
Sebastian Neubauer b8d1994778 [AMDGPU] Add A16/G16 to InstCombine
When sampling from images with coordinates that only have 16 bit
accuracy, convert the image intrinsic call to use a16 or g16.
This does only happen if the target hardware supports it.

An alternative would be to always apply this combination, independent of
the target hardware and extend 16 bit arguments to 32 bit arguments
during legalization. To me, this sounds like an unnecessary roundtrip
that could prevent some further InstCombine optimizations.

Differential Revision: https://reviews.llvm.org/D85887
2020-08-20 10:51:49 +02:00
Georgii Rymar a6436b0b3a [yaml2obj] - Make the 'Machine' key optional.
Currently we have to set 'Machine' to something in our
YAML descriptions. Usually we use 'EM_X86_64' for 64-bit targets
and 'EM_386' for 32-bit targets. At the same time, in fact, in most
cases our tests do not need a machine type and we can use
'EM_NONE'.

This is cleaner, because avoids the need of using a particular machine.

In this patch I've made the 'Machine' key optional (the default value,
when it is not specified is `EM_NONE`) and removed it (where possible)
from yaml2obj, obj2yaml and llvm-readobj tests.

There are few tests left where I decided not to remove it, because
I didn't want to touch CHECK lines or doing anything more complex
than a removing a "Machine: *" line and formatting lines around.

Differential revision: https://reviews.llvm.org/D86202
2020-08-20 11:40:51 +03:00
Bevin Hansson f03b10f57e [IR] Add FixedPointBuilder.
This patch adds a convenience class for using FixedPointSemantics
to build fixed-point operations in IR.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D85314
2020-08-20 10:29:57 +02:00
Bevin Hansson 1a995a0af3 [ADT] Move FixedPoint.h from Clang to LLVM.
This patch moves FixedPointSemantics and APFixedPoint
from Clang to LLVM ADT.

This will make it easier to use the fixed-point
classes in LLVM for constructing an IR builder for
fixed-point and for reusing the APFixedPoint class
for constant evaluation purposes.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html

Reviewed By: leonardchan, rjmccall

Differential Revision: https://reviews.llvm.org/D85312
2020-08-20 10:29:45 +02:00
Johannes Doerfert 2f38c755ba Revert "[IR] Intrinsics default attributes and opt-out flag"
This commit introduced a non-trivial compile time regression that needs
to be addressed: https://reviews.llvm.org/D70365#2227627
Given that it is unclear how long that will take, I'll revert it for
now.

This reverts commit eedf18fc1f.
2020-08-20 00:25:32 -05:00
Matt Arsenault 31adc28d24 GlobalISel: Implement fewerElementsVector for G_CONCAT_VECTORS sources
This fixes <6 x s16> = G_CONCAT_VECTORS from <3 x s16> handling.
2020-08-19 18:53:24 -04:00
Francesco Petrogalli dac0b1d330 [llvm] Add default constructor of `llvm::ElementCount`.
This patch prevents failures like those reported in
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/34173.

We have enabled the default constructor for
`llvm::ElementCount` to make sure the code compiles on Windows.

Reviewed By: ormris

Differential Revision: https://reviews.llvm.org/D86240
2020-08-19 21:39:24 +00:00
Sanjay Patel 6f3511a01a [ValueTracking] define/use max recursion depth in header
There's a potential motivating case to increase this limit in PR47191:
http://bugs.llvm.org/PR47191

But first we should make it less hacky. The limit in InstCombine is directly tied
to this value because an increase there can cause asserts in the underlying value
tracking calls if not changed together. The usage in VectorUtils is independent,
but the comment suggests that we should use the same value unless there's a known
reason to diverge. There are similar limits in codegen analysis, but I think we
should leave those independent in case we intentionally want the optimization
power/cost to be different there.

Differential Revision: https://reviews.llvm.org/D86113
2020-08-19 16:56:59 -04:00
Matt Arsenault adbcc8e733 GlobalISel: Add TargetLowering member to LegalizerHelper 2020-08-19 14:50:35 -04:00
Matt Arsenault e95c08432a GlobalISel: Use Register 2020-08-19 13:45:31 -04:00
Mehdi Amini a407ec9b6d Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private.""
Was reverted because MLIR/Flang builds were broken, these APIs have been
fixed in the meantime.
2020-08-19 17:26:36 +00:00
Mehdi Amini 4fc56d70aa Revert "[NFC][llvm] Make the contructors of `ElementCount` private."
This reverts commit 264afb9e6a.
(and dependent 6b742cc48 and fc53bd610f)

MLIR/Flang are broken.
2020-08-19 17:21:37 +00:00
Jessica Paquette d25b12bdc3 [GlobalISel] Add combine for (x & mask) -> x when (x & mask) == x
If we have a mask, and a value x, where (x & mask) == x, we can drop the AND
and just use x.

This is about a 0.4% geomean code size improvement on CTMark at -O3 for AArch64.

In AArch64, this is most useful post-legalization. Patterns like this often
show up when legalizing s1s, which must be extended to larger types.

e.g.

```
%cmp:_(s32) = G_ICMP ...
%and:_(s32) = G_AND %cmp, 1
```

Since G_ICMP only produces a single bit, there's no reason to mask it with the
G_AND.

Differential Revision: https://reviews.llvm.org/D85463
2020-08-19 10:20:57 -07:00
Francesco Petrogalli 264afb9e6a [NFC][llvm] Make the contructors of `ElementCount` private.
Differential Revision: https://reviews.llvm.org/D86120
2020-08-19 16:26:44 +00:00
Bjorn Pettersson 54105d635d [GlobalISel] Untabify InstructionSelectorImpl.h. NFC 2020-08-19 12:00:00 +02:00
sstefan1 eedf18fc1f [IR] Intrinsics default attributes and opt-out flag
Intrinsic properties can now be set to default and applied to all
intrinsics. If the attributes are not needed, the user can opt-out by
setting the DisableDefaultAttributes flag to true.

Differential Revision: https://reviews.llvm.org/D70365
2020-08-19 10:50:46 +02:00
Ronak Chauhan fdf71d486c Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors"
This reverts commit cacfb02d28.

Reverting due to buildbot failures.
2020-08-19 13:12:29 +05:30
David Sherwood 3f36561f69 [SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorLoads
In DAGTypeLegalizer::GenWidenVectorLoads the algorithm assumes it only
ever deals with fixed width types, hence the offsets for each individual
store never take 'vscale' into account. I've changed the code in that
function to use TypeSize instead of unsigned for tracking the remaining
load amount. In addition, I've changed the load loop to use the new
IncrementPointer helper function for updating the addresses in each
iteration, since this handles scalable vector types.

Also, I've added report_fatal_errors in GenWidenVectorExtLoads,
TargetLowering::scalarizeVectorLoad and TargetLowering::scalarizeVectorStores,
since these functions currently use a sequence of element-by-element
scalar loads/stores. In a similar vein, I've also added a fatal error
report in FindMemType for the case when we decide to return the element
type for a scalable vector type.

I've added new tests in

  CodeGen/AArch64/sve-split-load.ll
  CodeGen/AArch64/sve-ld-addressing-mode-reg-imm.ll

for the changes in GenWidenVectorLoads.

Differential Revision: https://reviews.llvm.org/D85909
2020-08-19 07:54:32 +01:00
Yaxun (Sam) Liu 7546b29e76 [HIP] Support target id by --offload-arch
This patch introduces support of target id by
-offload-arch.

Differential Revision: https://reviews.llvm.org/D60620
2020-08-18 23:43:53 -04:00
Ronak Chauhan cacfb02d28 [AMDGPU] Support disassembly for AMDGPU kernel descriptors
Decode AMDGPU Kernel descriptors as assembler directives.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D80713
2020-08-19 08:49:07 +05:30
Elliott Hughes a7d0b7a786 ld128 demangle: allow space for 'L' suffix.
Summary:
Caught by HWASAN on arm64 Android (which uses ld128 for long double). This
was running the existing fuzzer.

The specific minimized fuzz input to reproduce this is:

  __cxa_demangle("1\006ILeeeEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", 0, 0, 0);

Reviewers: eugenis, srhines, #libc_abi!

Subscribers: kristof.beyls, danielkiss, libcxx-commits

Tags: #libc_abi

Differential Revision: https://reviews.llvm.org/D77924
2020-08-18 16:14:05 -07:00
Jessica Paquette bf36e90295 [GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList
It's annoying to have to maintain multiple, nearly identical chains of if
statements which all set the same attributes.

Add a helper function, `addFlagsUsingAttrFn` which performs the attribute
setting.

Then, use wrappers for that function in `lowerCall` and `setArgFlags`.

(Note that the flag-setting code in `setArgFlags` was missing the returned
attribute. There's no selection for this yet, so no test. It's an example of
the kind of thing this lets us avoid, though.)

Differential Revision: https://reviews.llvm.org/D86159
2020-08-18 11:07:33 -07:00
Matt Arsenault 5a15f6628e GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT
Add unit tests since AMDGPU will only trigger this for gigantic
vectors, and won't use the annoying odd sized breakdown case.
2020-08-18 13:51:19 -04:00
David Blaikie f7a49d2aa6 [WIP][DebugInfo] Lazily parse debug_loclist offsets
Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing
down memory usage of symbolizers that don't need to parse this data at
all. There's not much benefit to caching these anyway - since they are
O(1) lookup and reading once you know where the offset list starts (and
can do bounds checking with the offset list size too).

In general, I think it might be time to start paying down some of the
technical debt of loc/loclist/range/rnglist parsing to try to unify it a
bit more.

eg:

* Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection,
  LocSectionBase, LocTable, RngListTable, LoclistTableHeader (be nice if
  these were all wrapped up in two variables - one for loclists, one for
  rnglists)

* rnglists and loclists are handled differently (see:
  LoclistTableHeader, but no RnglistTableHeader)

* maybe all these types could be less stateful - lazily parse what they
  need to, even reparsing rather than caching because it doesn't seem
  too expensive, for instance. (though admittedly so long as it's
  constantcost/overead per compilatiton that's probably adequate)

* Maybe implementing and using a DWARFDataExtractor that can be
  sub-ranged (so we could slice it up to just the single contribution) -
  though maybe that's not so useful because loc/ranges need to refer to
  it by absolute, not contribution-relative mechanisms

Differential Revision: https://reviews.llvm.org/D86110
2020-08-18 10:49:39 -07:00
Amara Emerson 04a6ea5d77 [GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x
This is restricted to single use loads, which if we fold to sextloads we can
find more optimal addressing modes on AArch64.

This also fixes an overload the MachineFunction::getMachineMemOperand() method
which was incorrectly using the MF alignment instead of the MMO alignment.

Differential Revision: https://reviews.llvm.org/D85966
2020-08-18 10:42:15 -07:00
Amara Emerson 40e269ea6d [GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c'
By detecting this sign extend pattern early, we can uncover opportunities for
more optimizations.

Differential Revision: https://reviews.llvm.org/D85965
2020-08-18 10:42:15 -07:00
Jessica Paquette 224a8c639e [GlobalISel][CallLowering] Look through call parameters for flags
We weren't looking through the parameters on calls at all.

E.g., say you had

```
declare i32 @zext(i32 zeroext %x)

...
%y = call i32 @zext(i32 %something)
...

```

At the point of the call, we wouldn't know that the %something should have the
zeroext attribute.

This sets flags in about the same way as
TargetLoweringBase::ArgListEntry::setAttributes.

Differential Revision: https://reviews.llvm.org/D86125
2020-08-18 08:48:56 -07:00
Ronak Chauhan 7b777ee730 [ELF] Hide target specific methods as private
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86136
2020-08-18 18:26:08 +05:30
Ronak Chauhan e760e85680 [llvm-objdump][AMDGPU] Detect CPU string
AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly.
However, the AMDGPU target CPU is stored in e_flags in the ELF object.

This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D84519
2020-08-18 17:43:16 +05:30
Shinji Okumura 5e361e2aa4 [Attributor] Deduce noundef attribute
This patch introduces a new abstract attribute `AANoUndef` which corresponds to `noundef` IR attribute and deduce them.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85184
2020-08-18 18:05:54 +09:00
Johannes Doerfert 858c75f7d1 [Attributor][NFC] Directly return proper type to avoid casts 2020-08-17 23:36:36 -05:00
Harmen Stoppels a52173a3e5 Use find_library for ncurses
Currently it is hard to avoid having LLVM link to the system install of
ncurses, since it uses check_library_exists to find e.g. libtinfo and
not find_library or find_package.

With this change the ncurses lib is found with find_library, which also
considers CMAKE_PREFIX_PATH. This solves an issue for the spack package
manager, where we want to use the zlib installed by spack, and spack
provides the CMAKE_PREFIX_PATH for it.

This is a similar change as https://reviews.llvm.org/D79219, which just
landed in master.

Differential revision: https://reviews.llvm.org/D85820
2020-08-17 19:52:52 -07:00
Amy Kwan c7ec3a7e33 [PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang
This patch implements the vec_extractm function prototypes in altivec.h in
order to utilize the vector extract with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82675
2020-08-17 21:14:17 -05:00
Hamilton Tobon Mosquera 496f8e5b36 [OpenMPOpt][HideMemTransfersLatency] Split __tgt_target_data_begin_mapper into its "issue" and "wait" counterparts.
WIP that tries to hide the latency of runtime calls that involve host to
device memory transfers by splitting them into their "issue" and "wait"
versions. The "issue" is moved upwards as much as possible. The "wait" is
moved downards as much as possible. The "issue" issues the memory transfer
asynchronously, returning a handle. The "wait" waits in the returned
handle for the memory transfer to finish. We still lack of the movement.
2020-08-17 20:56:10 -05:00
Hongtao Yu 819b2d9c79 [llvm-objdump] Symbolize binary addresses for low-noisy asm diff.
When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise.
In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand.

So far only the X86 disassemblers are supported.

Test Plan:

llvm-objdump -d  --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:

<_start>:
               	push	rax
               	mov	dword ptr [rsp + 4], 0
               	mov	dword ptr [rsp], 0
               	mov	eax, dword ptr [rsp]
               	cmp	eax, dword ptr [rip + 4112]  # 202182 <g>
               	jge	0x20117e <_start+0x25>
               	call	0x201158 <foo>
               	inc	dword ptr [rsp]
               	jmp	0x201169 <_start+0x10>
               	xor	eax, eax
               	pop	rcx
               	ret
```

llvm-objdump -d  **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:

<_start>:
               	push	rax
               	mov	dword ptr [rsp + 4], 0
               	mov	dword ptr [rsp], 0
<L1>:
               	mov	eax, dword ptr [rsp]
               	cmp	eax, dword ptr  <g>
               	jge	 <L0>
               	call	 <foo>
               	inc	dword ptr [rsp]
               	jmp	 <L1>
<L0>:
               	xor	eax, eax
               	pop	rcx
               	ret
```

Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This will be a problem when diffing the disassembly from two optimizers where there are unnecessary false positives due to such branch target address changes. With `--symbolize-operand`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassemble of PC-relative global variable references is also prone to instruction insertion/deletion.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D84191
2020-08-17 16:55:12 -07:00
Matt Arsenault a128292b90 GlobalISel: Make type for lower action more consistently optional
Some of the lower implementations were relying on this, however the
type was not set depending on which form .lower* helper form you were
using. For instance, if you used an unconditonal lower(), the type was
never set. Most of the lower actions do not benefit from a type
parameter, and just expand in terms of the original operation's types.

However, some lowerings could benefit from an additional type hint to
combine a promotion and an expansion. An example of this is for
add/sub sat. The DAG integer legalization tries to use smarter
expansions directly when promoting the integer type, and doesn't
always produce the same instruction with a wider type.

Treat this as an optional hint argument, that only means something for
specific lower actions. It may be useful to generalize this mechanism
to pass a full list of type indexes and desired types, but I haven't
run into a case like that yet.
2020-08-17 16:24:55 -04:00
diggerlin 2f0d755d81 [AIX][XCOFF][Patch1] Provide decoding trace back table information API for xcoff object file for llvm-objdump -d
SUMMARY:

1. This patch provided API for decoding the traceback table info and unit test for the these API.

2. Another patchs will do the following things:
2.1 added a new option --traceback-table to decode the trace back table information for xcoff object file when
using llvm-objdump to disassemble the xcoff objfile.

2.2 print out the  traceback table information for llvm-objdump.

Reviewers:  Jason liu, Hubert Tong, James Henderson

Differential Revision: https://reviews.llvm.org/D81585
2020-08-17 16:23:47 -04:00
Dávid Bolvanský 0f14b2e6cb Revert "[BPI] Improve static heuristics for integer comparisons"
This reverts commit 50c743fa71. Patch will be split to smaller ones.
2020-08-17 20:44:33 +02:00
Valentin Clement 3060894bbb [flang][directives] Use TableGen to generate clause unparsing
Use the TableGen directive back-end to generate code for the clauses unparsing.

Reviewed By: sscalpone, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D85851
2020-08-17 14:22:25 -04:00
Matt Arsenault 5ca7c6386f GlobalISel: Fix parameter name in doxygen comment 2020-08-17 13:57:10 -04:00
Matt Arsenault fe171908e9 GlobalISel: Revisit users of other merge opcodes in artifact combiner
The artifact combiner searches for the uses of G_MERGE_VALUES for
unmerge/trunc that need further combining. This also needs to handle
the vector merge opcodes the same way. This fixes leaving behind some
pairs I expected to be removed, that were if the legalizer is run a
second time.
2020-08-17 13:56:53 -04:00
Matt Arsenault 924f31bc3c GlobalISel: Remove unnecessary check for copy type
COPY isn't allowed to change the type, but can mix no type with type.
2020-08-17 09:19:25 -04:00
Alex Zinenko 874aef875d [llvm] support graceful failure of DataLayout parsing
Existing implementation always aborts on syntax errors in a DataLayout
description. While this is meaningful for consuming textual IR modules, it is
inconvenient for users that may need fine-grained control over the layout from,
e.g., command-line options. Propagate errors through the parsing functions and
only abort in the top-level parsing function instead.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D85650
2020-08-17 15:10:37 +02:00
Simon Pilgrim c1f6ce0c73 [DemandedBits] Improve accuracy of Add propagator
The current demand propagator for addition will mark all input bits at and right of the alive output bit as alive. But carry won't propagate beyond a bit for which both operands are zero (or one/zero in the case of subtraction) so a more accurate answer is possible given known bits.

I derived a propagator by working through truth tables and using a bit-reversed addition to make demand ripple to the right, but I'm not sure how to make a convincing argument for its correctness in the comments yet. Nevertheless, here's a minimal implementation and test to get feedback.

This would help in a situation where, for example, four bytes (<128) packed into an int are added with four others SIMD-style but only one of the four results is actually read.

Known A:     0_______0_______0_______0_______
Known B:     0_______0_______0_______0_______
AOut:        00000000001000000000000000000000
AB, current: 00000000001111111111111111111111
AB, patch:   00000000001111111000000000000000

Committed on behalf of: @rrika (Erika)

Differential Revision: https://reviews.llvm.org/D72423
2020-08-17 12:54:09 +01:00
Vitaly Buka e10e7829bf [StackSafety] Skip ambiguous lifetime analysis
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D84630
2020-08-16 18:05:52 -07:00
Fady Ghanim aaa93a681b [OpenMP][OMPBuilder] Adding support for `omp single`
This adds support for generating `omp single`, and necessary calls for
`copyprivate` clause.

Differential Revision: https://reviews.llvm.org/D85617
2020-08-16 01:15:16 -04:00
Wenlei He 577e58bcc7 [InlineAdvisor] New inliner advisor to replay inlining from optimization remarks
This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context.

A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose.

This is a resubmit of https://reviews.llvm.org/D83743
2020-08-15 20:17:21 -07:00
Aditya Kumar 49a944af7f [NFC] Fix typo and variable names 2020-08-15 09:06:22 -07:00
Philip Reames 6b2105456a [Statepoint] Remove code related to inline operand bundles
This code becomes dead for valid IR after 48f4312 and a96fc46.  The reason for the test change is that the verifier reports the first verification error encountered, in some non-specified visit order.  By removing the verification code in gc.relocates for a statepoint with inline gc operands, I change the error the verifier reports.  And in one case, the checked for error is no longer possible with the bundle representation, so I simply delete the file.
2020-08-14 20:29:41 -07:00
Arthur Eubanks e6ea8779c2 [NewPM][optnone] Mark various passes as required
This was done by turning on -enable-npm-optnone and fixing failures.
That will be enabled in a follow-up change for ease of reverting.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D85457
2020-08-14 15:51:59 -07:00
Craig Topper c7a0b2684f [X86][MC][Target] Initial backend support a tune CPU to support -mtune
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.

This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.

One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.

I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.

Differential Revision: https://reviews.llvm.org/D85165
2020-08-14 15:31:50 -07:00
Jordan Rupprecht 38884641f2 Temporarily revert "[SCEVExpander] Add helper to clean up instrs inserted while expanding."
This reverts commit 7829c33084. The assertion is triggering on some internal code. A reduced test case is in progress.
2020-08-14 14:52:37 -07:00
Vitaly Buka fc4fd89852 [StackSafety] Use ValueInfo in ParamAccess::Call
This avoid GUID lookup in Index.findSummaryInModule.
Follow up for D81242.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D85269
2020-08-14 12:42:44 -07:00
Greg McGary eef41efe00 [MachO] Add skeletal support for DriverKit platform
Define the platform ID = 10, and simple mappings between platform ID & name.

Reviewed By: MaskRay, cishida

Differential Revision: https://reviews.llvm.org/D85594
2020-08-14 12:36:43 -07:00
Matt Arsenault 5c5e6d951e TableGen/GlobalISel: Partially handle immAllOnesV/immAllZerosV
These should really match either G_BUILD_VECTOR or
G_BUILD_VECTOR_TRUNC, but there doesn't seem to be an existing
mechanism for matching alternative opcodes. There is GIM_SwitchOpcode,
but it seems to assume it's oly only used for matcher optimization.

I could also omit any opcode check and rely on the matcher directly
checking the opcode, but the table optimizer currently assumes there
has to be an opcode check.

Also doesn't try to handle undef elements like the DAG version.
2020-08-14 13:55:30 -04:00
Bjorn Pettersson b395d67a88 [Orc] Fix werror for unused variable in noasserts build 2020-08-14 15:58:04 +02:00
Stefan Gränitz 28e1015e32 [ORC] Fix missing include in OrcRemoteTargetClient.h 2020-08-14 12:00:18 +02:00
Stefan Gränitz 6bf74a924f [ORC] In LLLazyJIT provide public access to the CompileOnDemandLayer
This is analog to how LLJIT provides public access to all its layers.

Differential Revision: https://reviews.llvm.org/D85921
2020-08-14 11:34:44 +02:00
Stefan Gränitz 30c4561e36 [ORC] Add JITLink-compatible remote memory-manager and LLJITWithChildProcess example
This adds RemoteJITLinkMemoryManager is a new subclass of OrcRemoteTargetClient. It implements jitlink::JITLinkMemoryManager and targets the OrcRemoteTargetRPCAPI.

Behavior should be very similar to RemoteRTDyldMemoryManager. The essential differnce with JITLink is that allocations work in isolation from its memory manager. Thus, the RemoteJITLinkMemoryManager might be seen as "JITLink allocation factory".

RPCMMAlloc is another subclass of OrcRemoteTargetClient and implements the actual functionality. It allocates working memory on the host and target memory on the remote target. Upon finalization working memory is copied over to the tagrte address space. Finalization can be asynchronous for JITLink allocations, but I don't see that it makes a difference here.

Differential Revision: https://reviews.llvm.org/D85919
2020-08-14 11:34:44 +02:00
Yuanfang Chen a5ed20b549 [NewPM][CodeGen] Add machine code verification callback
D83608 need this.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D85916
2020-08-13 16:13:01 -07:00
Stefan Gränitz f12db8cf75 [ORC] cloneToNewContext() can work with a const-ref to ThreadSafeModule 2020-08-13 21:01:21 +02:00
Stefan Gränitz 34a5669ccd [ORC] Fix SymbolLookupSet::containsDuplicates() 2020-08-13 21:01:21 +02:00
Haowei Wu d650cbc349 [elfabi] Move llvm-elfabi related code to InterfaceStub library
This change moves elfabi related code to llvm/InterfaceStub library
so it can be shared by multiple llvm tools without causing cyclic
dependencies.

Differential Revision: https://reviews.llvm.org/D85678
2020-08-13 11:51:44 -07:00
Sameer Arora 8d58eb11f9 [llvm-libtool-darwin] Refactor ArchiveWriter
Refactoring function `writeArchive` in ArchiveWriter. Added a new
function `writeArchiveBuffer` that returns the archive in a memory
buffer instead of writing it out to the disk. This refactor is necessary
so as to allow `llvm-libtool-darwin` to write universal files containing
archives.

Reviewed by jhenderson, MaskRay, smeenai

Differential Revision: https://reviews.llvm.org/D84858
2020-08-13 10:56:30 -07:00
Dávid Bolvanský 50c743fa71 [BPI] Improve static heuristics for integer comparisons
Similarly as for pointers, even for integers a == b is usually false.

GCC also uses this heuristic.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D85781
2020-08-13 19:54:27 +02:00
Fangrui Song 7f8c49b016 [llvm-objdump] Change symbol name/PLT decoding errors to warnings
If the referenced symbol of a J[U]MP_SLOT is invalid (e.g. symbol index 0), llvm-objdump -d will bail out:

```
error: 'a': st_name (0x326600) is past the end of the string table of size 0x7
```

where 0x326600 is the st_name field of the first entry past the end of .symtab

Change it to a warning to continue dumping.
`X86/plt.test` uses a prebuilt executable, so I pick `ELF/AArch64/plt.test`
which has a YAML input and can be easily modified.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D85623
2020-08-13 08:13:42 -07:00
David Stenberg e8ebebb0bd [InstCombine] Fix incorrect Modified status
When removing instructions from unreachable blocks, and only debug info
intrinsics were removed, InstCombine could incorrectly return a false
Modified status.

This is fixed by making removeAllNonTerminatorAndEHPadInstructions()
also return how many debug info intrinsics that were removed, and take
that into account.

This was caught using the check introduced by D80916.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D85839
2020-08-13 15:10:41 +02:00
Dávid Bolvanský f9264995a6 Revert "[BPI] Improve static heuristics for integer comparisons"
This reverts commit 44587e2f7e. Sanitizer tests need to be updated.
2020-08-13 14:37:40 +02:00
Dávid Bolvanský 44587e2f7e [BPI] Improve static heuristics for integer comparisons
Similarly as for pointers, even for integers a == b is usually false.

GCC also uses this heuristic.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D85781
2020-08-13 14:23:58 +02:00
Dávid Bolvanský a0485421d2 Revert "[BPI] Improve static heuristics for integer comparisons"
This reverts commit 385c9d673f.
2020-08-13 12:59:15 +02:00
Dávid Bolvanský 385c9d673f [BPI] Improve static heuristics for integer comparisons
Similarly as for pointers, even for integers a == b is usually false.

GCC also uses this heuristic.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D85781
2020-08-13 12:45:40 +02:00
Xing GUO b7d5d1ec64 [DWARFYAML] Replace InitialLength with Format and Length. NFC.
This change replaces the InitialLength of pub-tables with Format and
Length. All the InitialLength fields have been removed.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D85880
2020-08-13 18:39:03 +08:00
David Sherwood 6af1677161 [SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorStores
In DAGTypeLegalizer::GenWidenVectorStores the algorithm assumes it only
ever deals with fixed width types, hence the offsets for each individual
store never take 'vscale' into account. I've changed the main loop in
that function to use TypeSize instead of unsigned for tracking the
remaining store amount and offset increment. In addition, I've changed
the loop to use the new IncrementPointer helper function for updating
the addresses in each iteration, since this handles scalable vector
types.

Whilst fixing this function I also fixed a minor issue in
IncrementPointer whereby we were not adding the no-unsigned-wrap flag
for the add instruction in the same way as the fixed width case does.

Also, I've added a report_fatal_error in GenWidenVectorTruncStores,
since this code currently uses a sequence of element-by-element scalar
stores.

I've added new tests in

  CodeGen/AArch64/sve-intrinsics-stores.ll
  CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll

for the changes in GenWidenVectorStores.

Differential Revision: https://reviews.llvm.org/D84937
2020-08-13 11:07:17 +01:00
David Sherwood 3ec3fcb97a [CodeGen] In narrowExtractedVectorLoad bail out for scalable vectors
In narrowExtractedVectorLoad there is an optimisation that tries to
combine extract_subvector with a narrowing vector load. At the moment
this produces warnings due to the incorrect calls to
getVectorNumElements() for scalable vector types. I've got this
working for scalable vectors too when the extract subvector index
is a multiple of the minimum number of elements. I have added a
new variant of the function:

  MachineFunction::getMachineMemOperand

that copies an existing MachineMemOperand, but replaces the pointer
info with a null version since we cannot currently represent scaled
offsets.

I've added a new test for this particular case in:

  CodeGen/AArch64/sve-extract-subvector.ll

Differential Revision: https://reviews.llvm.org/D83950
2020-08-13 10:46:18 +01:00
Amara Emerson 2ff14957e8 [GlobalISel] Implement bit-test switch table optimization.
This is mostly a straight port from SelectionDAG. We re-use the actual bit-test
analysis part from SwitchLoweringUtils, which was factored out earlier to
support jump-tables.

Differential Revision: https://reviews.llvm.org/D85233
2020-08-12 11:31:39 -07:00
Christopher Tetreault 1da09b7214 [SVE] Remove default-false VectorType::get
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84212
2020-08-12 10:37:05 -07:00
David Green f07f17ac7c [Scheduler] Fix typo in comments. NFC 2020-08-12 18:36:05 +01:00
Xing GUO e891b6a75d [DWARFYAML] Make the address size of compilation units optional.
This patch makes the 'AddrSize' field optional. If the address size is
missing, yaml2obj will infer it from the object file.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D85805
2020-08-12 21:47:32 +08:00
Igor Kudrin 9ceb192e14 [llvm-dwarfdump] Avoid crashing if an abbreviation offset is invalid.
Note that DWARFUnit::getAbbreviations() returns nullptr if the
abbreviations could not be read, but callers used the returned
pointer without checking.

Differential Revision: https://reviews.llvm.org/D85738
2020-08-12 16:01:53 +07:00
David Sherwood 88bbd30736 [SVE][CodeGen] Fix issues with EXTRACT_SUBVECTOR when using scalable FP vectors
In this patch I have fixed two issues:

1. Our SVE tuple get/set intrinsics were using the wrong constant type
for the index passed to EXTRACT_SUBVECTOR. I have fixed this by using the
function SelectionDAG::getVectorIdxConstant to create the value. Also, I
have updated the documentation for EXTRACT_SUBVECTOR describing what type
the constant index should be and we now enforce this when creating the
node.
2. The AArch64 backend was missing the appropriate patterns for
extracting certain subvectors (nxv4f16 and nxv2f32) from legal SVE types.
I have added them as part of this patch.

The only way that I could find to test the new patterns was to use the
SVE tuple get intrinsics, although I realise it looks a bit unusual.
Tests added here:

  test/CodeGen/AArch64/sve-extract-subvector.ll

Differential Revision: https://reviews.llvm.org/D85516
2020-08-12 08:35:46 +01:00
Kiran Chandramohan e6c5e6efd0 [MLIR,OpenMP] Lowering of parallel operation: proc_bind clause 2/n
This patch adds the translation of the proc_bind clause in a
parallel operation.

The values that can be specified for the proc_bind clause are
specified in the OMP.td tablegen file in the llvm/Frontend/OpenMP
directory. From this single source of truth enumeration for
proc_bind is generated in llvm and mlir (used in specification of
the parallel Operation in the OpenMP dialect). A function to return
the enum value from the string representation is also generated.
A new header file (DirectiveEmitter.h) containing definitions of
classes directive, clause, clauseval etc is created so that it can
be used in mlir as well.

Reviewers: clementval, jdoerfert, DavidTruby

Differential Revision: https://reviews.llvm.org/D84347
2020-08-12 08:03:13 +01:00
Petr Hosek 31e5f7120b [CMake] Simplify CMake handling for zlib
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.

This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.

Differential Revision: https://reviews.llvm.org/D79219
2020-08-11 20:22:11 -07:00
Vedant Kumar 30c1633386 Revert "[Instruction] Add updateLocationAfterHoist helper"
This reverts commit 4a646ca9e2.

This is causing some bots to fail with "!dbg attachment points at wrong
subprogram for function", like:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/67958/steps/stage%201%20check/logs/stdio
2020-08-11 14:54:09 -07:00
Vedant Kumar 4a646ca9e2 [Instruction] Add updateLocationAfterHoist helper
Introduce a helper on Instruction which can be used to update the debug
location after hoisting.

Use this in GVN and LICM, where we were mistakenly introducing new line
0 locations after hoisting (the docs recommend dropping the location in
this case).

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-08-11 14:05:20 -07:00
Nikita Popov 06d567059e [InstSimplify] Respect CanUseUndef in more places
Similar to what we do in IIQ, add an isUndefValue() helper that
checks for undef values while respective CanUseUndef. This makes
it much easier to search for places that don't respect the flag
yet.
2020-08-11 21:53:33 +02:00
diggerlin e9ac1495e2 [AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations
SUMMARY:

1. in the patch  , remove setting storageclass in function .getXCOFFSection and construct function of class MCSectionXCOFF
there are

XCOFF::StorageMappingClass MappingClass;
XCOFF::SymbolType Type;
XCOFF::StorageClass StorageClass;
in the MCSectionXCOFF class,
these attribute only used in the XCOFFObjectWriter, (asm path do not need the StorageClass)

we need get the value of StorageClass, Type,MappingClass before we invoke the getXCOFFSection every time.

actually , we can get the StorageClass of the MCSectionXCOFF  from it's delegated symbol.

2. we also change the oprand of branch instruction from symbol name to qualify symbol name.
for example change
bl .foo
extern .foo
to
bl .foo[PR]
extern .foo[PR]

3. and if there is reference indirect call a function bar.
we also add
  extern .bar[PR]

Reviewers:  Jason liu, Xiangling Liao

Differential Revision: https://reviews.llvm.org/D84765
2020-08-11 15:26:19 -04:00
Jessica Paquette bebe6a6449 [GlobalISel] Combine (logic_op (op x...), (op y...)) -> (op (logic_op x, y))
This implements

```
(logic_op (op x...), (op y...)) -> (op (logic_op x, y))
```

when `op` is an extend, a shift, or an and.

This is similar to `DAGCombiner::hoistLogicOpWithSameOpcodeHands`
(with a bunch of missing cases, e.g. G_TRUNC, G_BITCAST, etc.)

This is implemented so it works both pre and post-legalization.

This also adds a general way to add a series of instructions in a combine.
(`applyBuildInstructionSteps`).

Differential Revision: https://reviews.llvm.org/D85050
2020-08-11 10:40:06 -07:00
Matt Arsenault 8dd2eb10bb GlobalISel: Fix typo 2020-08-11 13:08:56 -04:00
Lang Hames eed19c8c7e [ORC] Move file-descriptor based raw byte channel into a public header.
This will enable re-use in other llvm tools.
2020-08-11 09:50:58 -07:00
Nikita Popov d110d4aaff [InstSimplify] Forbid undef folds in expandBinOp
This is the replacement for D84250 based on D84792. As we recursively
fold with the same value twice, we need to disable undef folds,
to prevent an undef from being folded to two different values.

Reverting rG00f3579aea6e3d4a4b7464c3db47294f71cef9e4 and using the
test case from https://reviews.llvm.org/D83360#2145793, it no longer
performs the incorrect fold.

Differential Revision: https://reviews.llvm.org/D85684
2020-08-11 18:39:24 +02:00
Jay Foad fa2b836ea3 [GlobalISel] Add G_ABS
This is equivalent to the new llvm.abs intrinsic added by D84125 with
is_int_min_poison=0.

Differential Revision: https://reviews.llvm.org/D85718
2020-08-11 16:34:37 +01:00
Sanjay Patel 1470ce4a76 [InstSimplify] fold min/max with matching min/max operands
I think this is the last remaining translation of an existing
instcombine transform for the corresponding cmp+sel idiom.

This interpretation is more general though - we can remove
mismatched signed/unsigned combinations in addition to the
more obvious cases.

min/max(X, Y) must produce X or Y as the result, so this is
just another clause in the existing transform that was already
matching a min/max of min/max.
2020-08-11 11:23:15 -04:00
Valentin Clement 16c1d251c4 [flang][directives] Use TableGen information for clause classes in parse-tree
This patch takes advantage of the directive information and tablegen generation
to replace the clauses class parse tree and in the dump parse tree sections.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D85549
2020-08-11 10:44:14 -04:00
Matt Arsenault e2f1b48f86 GlobalISel: Implement bitcast action for G_INSERT_VECTOR_ELT
This mirrors the support for the equivalent extracts. This also
creates a huge mess that would be greatly improved if we had any bit
operation combines.
2020-08-11 10:39:14 -04:00
Matt Arsenault 53f21e0fb7 TableGen/GlobalISel: Hack the operand order for atomic_store
ISD::ATOMIC_STORE arbitrarily has the operands in the opposite order
from regular ISD::STORE, which always introduced an annoying
duplication of patterns to handle both cases. Since in GlobalISel
there's just the one G_STORE, we need to swap the operands to
correctly emit the type check for the pointer operand.

Some work started in 20aafa3156 to
migrate SelectionDAG to use ISD::STORE for atomics, but that work
seems to have stalled. Since this is the pretty much the last
operation which matters which isn't supported for AMDGPU, use this
compatibility hack to unblock declaring it functionally complete.

Not sure what's going on with the pending_phis AArch64 test. It seems
it didn't always use atomics, and I'm not sure what it was originally
testing matters anymore.
2020-08-11 10:22:44 -04:00
clementval 3b3dc1dbff Revert "[flang][directives] Use TableGen information for clause classes in parse-tree"
This reverts commit bf93edc475.

Buildbot failure
2020-08-11 09:54:04 -04:00
Valentin Clement bf93edc475 [flang][directives] Use TableGen information for clause classes in parse-tree
This patch takes advantage of the directive information and tablegen generation
to replace the clauses class parse tree and in the dump parse tree sections.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D85549
2020-08-11 09:43:11 -04:00
David Stenberg 91bd9db2cd [DebugInfo] Allow GNU macro extension to be read
Allow the GNU .debug_macro extension to be parsed and printed by
llvm-dwarfdump. In an upcoming patch support will be added for emitting
that format also.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D82974
2020-08-11 13:30:52 +02:00
David Stenberg 2892ed6d0f [DebugInfo] Introduce GNU macro extension entry encodings
This is a preparatory patch for allowing the GNU .debug_macro extension,
which is a precursor to the DWARF 5 format, to be emitted by LLVM for
earlier DWARF versions.

The entries share the same encoding and behavior as in DWARF5; there are
just more entries in the DWARF 5 format. Therefore, we could have used
those existing DWARF 5 entries, but I think that explicitly referring to
the GNU macro variants makes the code more clear.

The defines that this patch introduces can be found in GCC in the dwarf2.h header:
  https://gcc.gnu.org/git/?p=gcc.git;a=blob;
  f=include/dwarf2.h;
  h=0b6facfd4cf4c02320c7328114231b128ab42d5e;
  hb=dccbf1e2a6e544f71b4a5795f0c79015db019fc3#l425

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D82972
2020-08-11 13:30:52 +02:00
Kerry McLaughlin 85c7e89f3b [CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize
Changes the Offset arguments to both functions from int64_t to TypeSize
& updates all uses of the functions to create the offset using TypeSize::Fixed()

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85220
2020-08-11 12:17:10 +01:00
Kai Nacke b3aece0531 [SystemZ/ZOS] Add binary format goff and operating system zos to the triple
Adds the binary format goff and the operating system zos to the triple
class. goff is selected as default binary format if zos is choosen as
operating system. No further functionality is added.

Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay

Reviewed By: efriedma, tahonermann, hubert.reinterpertcast

Differential Revision: https://reviews.llvm.org/D82081
2020-08-11 05:26:26 -04:00
Florian Hahn 7829c33084 [SCEVExpander] Add helper to clean up instrs inserted while expanding.
SCEVExpander already tracks which instructions have been inserted n
InsertedValues/InsertedPostIncValues. This patch adds an additional
vector to collect the instructions in insertion order. This can then be
used to remove exactly the instructions inserted by the expander.

This replaces ExpandedValuesCleaner, which in some cases might remove
values not inserted by the expander (e.g. if a value was dead before
insertion and is then used during expansion).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D84327
2020-08-11 09:30:31 +01:00
Shinji Okumura 06eee8748f [Attributor][NFC] Connect AAPotentialValues with AAValueSimplify
This patch enables `AAValueSimplify` to use information from `AAPotentialValues`

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85668
2020-08-11 15:52:02 +09:00
Haowei Wu db91320a89 Revert "Move ELFObjHandler to TextAPI library"
This reverts commit e6f8ba12e6 due
to build failures.
2020-08-10 21:31:29 -07:00
Haowei Wu e6f8ba12e6 Move ELFObjHandler to TextAPI library
This change moves ELFObjHandler to llvm/TextAPI library so it can
be used by different llvm tools.
2020-08-10 21:23:39 -07:00
Wang, Pengfei 9512525947 [X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics.
When we use mask compare intrinsics under strict FP option, the masked
elements shouldn't raise any exception. So, we cann't replace the
intrinsic with a full compare + "and" operation.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D85385
2020-08-11 10:28:41 +08:00
Yuanfang Chen d04f3e028d [CodeGen] Make MMI immutable NPM pass 2020-08-10 17:52:42 -07:00
Lang Hames 6fd30f0669 [llvm-jitlink] Update llvm-jitlink to use TargetProcessControl. 2020-08-10 17:19:48 -07:00
Johannes Doerfert fa5d22a045 [OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang
Replace the `ident_t` handling in Clang with the methods offered by the
OMPIRBuilder. This cuts down on the clang code as well as the
differences between the two, making further transitions easier. Tests
have changed but there should not be a real functional change. The most
interesting difference is probably that we stop generating local ident_t
allocations for now and just use globals. Given that this happens only
with debug info, the location part of the `ident_t` is probably bigger
than the test anyway. As the location part is already a global, we can
avoid the allocation, memcpy, and store in favor of a constant global
that is slightly bigger. This can be revisited if there are
complications.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D80735
2020-08-10 17:13:26 -05:00
jasonliu 20abff0481 [XCOFF][AIX] Use TE storage mapping class when large code model is enabled
Summary:
Use TE SMC instead of TC SMC in large code model mode,
so that large code model TOC entries could get placed after all
the small code model TOC entries, which reduces the chance of TOC overflow.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D85455
2020-08-10 19:52:10 +00:00
Craig Topper 96dfc783b2 [BreakFalseDeps][X86] Move operand loop out of X86's getUndefRegClearance and put in the pass.
X86 is the only user of this interface in tree. Previously the
X86 pass would loop over operands looking for one undef operand for
the pass to fix. But there could theoretically be multiple operands
to fix. So it makes more sense for the pass to do the looping and
ask the target if an operand needs to be fixed.
2020-08-10 10:32:29 -07:00
Xiangling Liao 6ef801aa6b [AIX] Static init frontend recovery and backend support
On the frontend side, this patch recovers AIX static init implementation to
use the linkage type and function names Clang chooses for sinit related function.

On the backend side, this patch sets correct linkage and function names on aliases
created for sinit/sterm functions.

Differential Revision: https://reviews.llvm.org/D84534
2020-08-10 10:10:49 -04:00
James Henderson ca05601cd2 [DebugInfo] Don't error for zero-length arange entries
Although the DWARF specification states that .debug_aranges entries
can't have length zero, these can occur in the wild. There's no
particular reason to enforce this part of the spec, since functionally
they have no impact. The patch removes the error and introduces a new
warning for premature terminator entries which does not stop parsing.

This is a relanding of cb3a598c87, adding the missing obj2yaml part
that was needed.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also
https://reviews.llvm.org/D71932 which originally introduced the error.

Reviewed by: ikudrin, dblaikie, Higuoxing

Differential Revision: https://reviews.llvm.org/D85313
2020-08-10 14:57:52 +01:00
Matt Arsenault f9c279b057 PeepholeOptimizer: Use Register 2020-08-10 08:49:36 -04:00
Sanjay Patel 3d5118b75c [InstCombine] auto-generate test checks; NFC 2020-08-10 08:27:38 -04:00
Nico Weber bc5d68dd8a Revert "[DebugInfo] Don't error for zero-length arange entries"
This reverts commit cb3a598c87.
Breaks build of check-llvm dep obj2yaml everywhere.
2020-08-10 08:20:35 -04:00
James Henderson cb3a598c87 [DebugInfo] Don't error for zero-length arange entries
Although the DWARF specification states that .debug_aranges entries
can't have length zero, these can occur in the wild. There's no
particular reason to enforce this part of the spec, since functionally
they have no impact. The patch removes the error and introduces a new
warning for premature terminator entries which does not stop parsing.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also
https://reviews.llvm.org/D71932 which originally introduced the error.

Reviewed by: ikudrin, dblaikie

Differential Revision: https://reviews.llvm.org/D85313
2020-08-10 12:48:31 +01:00
Qiu Chaofan dbcfbffc7a [PowerPC] Add intrinsic to read or set FPSCR register
This patch introduces two intrinsics: llvm.ppc.setflm and
llvm.ppc.readflm. They read from or write to FPSCR register
(floating-point status & control) which contains rounding mode and
exception status.

To ensure correctness of program, we need to prevent FP operations from
being moved across these intrinsics (mffs/mtfsf instruction), so here I
set them as scheduling boundaries. We can relax such restriction if
FPSCR is modeled well in the future.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84914
2020-08-10 18:27:45 +08:00
Petar Avramovic 0d58d9e8fb AMDGPU/GlobalISel: Lower G_FREM
Add custom lower for G_FREM.

Differential Revision: https://reviews.llvm.org/D84324
2020-08-10 10:10:46 +02:00
Vitaly Buka fbd33baa27 [NFC][Attributor] Add missing override 2020-08-09 23:30:42 -07:00
Shinji Okumura ff1002aab0 [Attributor][NFC][AAPotentialValues] Change interface of PotentialValuesState
Previously `PotentialValuesState` inherited `BooleanState`.
We have to add `getAssumed` to the state in order to use `clampStateAndIndicateChange` (which will be used in `AAPotentialValuesArgument`).
However `BooleanState::getAssumed` is not a virtual function and we cannot override it.
Therefore, I changed the state not to inherit `BooleanState` and add `getAssumed` to it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85610
2020-08-10 09:18:10 +09:00
Florian Hahn 5a0d6cdbd1 [InstSimplify] Make sure CanUseUndef is initialized in all cases.
This should fix a bunch of buildbot failures.
2020-08-09 19:47:16 +01:00
Florian Hahn d236e1c7b6 [InstSimplify/NewGVN] Add option to control the use of undef.
Making use of undef is not safe if the simplification result is not used
to replace all uses of the result. This leads to problems in NewGVN,
which does not replace all uses in the IR directly. See PR33165 for more
details.

This patch adds an option to SimplifyQuery to disable the use of undef.

Note that I've only guarded uses if isa<UndefValue>/m_Undef where
SimplifyQuery is currently available. If we agree on the general
direction, I'll update the remaining uses.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84792
2020-08-09 19:16:56 +01:00
Florian Hahn c70f0b9d4a [SCEVExpander] Avoid re-using existing casts if it means updating users.
Currently the SCEVExpander tries to re-use existing casts, even if they
are not exactly at the insertion point it was asked to create the cast.
To do so in some case, it creates a new cast at the insertion point and
updates all users to use the new cast.

This behavior is problematic, because it changes the IR outside of the
instructions created during the expansion. Therefore we cannot
completely undo all changes made during expansion.

This re-use should be only an extra optimization, so only using the new
cast in the expanded instructions should not be a correctness issue.
There are many cases equivalent instructions are created during
expansion.

This patch also adjusts findInsertPointAfter to skip instructions
inserted during expansion. This enables re-using existing casts without
the renaming any uses, by picking a better insertion point.

Reviewed By: efriedma, lebedev.ri

Differential Revision: https://reviews.llvm.org/D84399
2020-08-09 13:25:17 +01:00
Petr Hosek a4d78d23c5 Revert "[CMake] Simplify CMake handling for zlib"
This reverts commit ccbc1485b5 which
is still failing on the Windows MLIR bots.
2020-08-08 17:08:23 -07:00
Petr Hosek ccbc1485b5 [CMake] Simplify CMake handling for zlib
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.

This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.

Differential Revision: https://reviews.llvm.org/D79219
2020-08-08 16:44:08 -07:00
Yuanfang Chen f5b5ccf2a6 Reland "Revert "[NewPM][CodeGen] Introduce machine pass and machine pass manager""
This relands commit 320eab2d55.

The test failed because it was looking for x86-linux target
unconditionally. Now it gets the default target.
2020-08-07 16:40:49 -07:00
Arthur Eubanks 7abef41674 [NewPM] Print 'Skipping pass' as pass instrumentation
If OptNoneInstrumentation prints it instead, 'Skipping pass' will print for even required passes.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85493
2020-08-07 15:02:02 -07:00
Sameer Arora 645de3664a [llvm-libtool-darwin] Add constant CPU_SUBTYPE_ARM64_V8
Add support for constant MachO::CPU_SUBTYPE_ARM64_V8. This constant is
needed so as to match `llvm-libtool-darwin`'s behavior to that of
cctools' libtool when `-arch_only` flag is passed in on command line.

Reviewed by jhenderson, alexshap, smeenai

Differential Revision: https://reviews.llvm.org/D85041
2020-08-07 14:09:27 -07:00
Vitaly Buka 7547508b7a Revert "[StackSafety] Skip ambiguous lifetime analysis"
This reverts commit 0b2616a804.

Crashes with safe-stack.
2020-08-07 14:02:50 -07:00
Matt Arsenault 5a0b1472c0 GlobalISel: Handle zext(sext x) in artifact combiner
This eliminates the illegal intermediate s8 value in the added test.
2020-08-07 16:37:46 -04:00
Yuanfang Chen 320eab2d55 Revert "[NewPM][CodeGen] Introduce machine pass and machine pass manager"
This reverts commit 911565d108.

Broke some non-Linux bots.
2020-08-07 11:59:58 -07:00
Yuanfang Chen 911565d108 [NewPM][CodeGen] Introduce machine pass and machine pass manager
machine pass could define four methods:
- `PreservedAnalyses run(MachineFunction &, MachineFunctionAnalysisManager &)`
- `Error doInitialization(Module &, MachineFunctionAnalysisManager &)`
- `Error doFinalization(Module &, MachineFunctionAnalysisManager &)`
- `Error run(Module &, MachineFunctionAnalysisManager &)`

machine pass manger:
- MachineFunctionAnalysisManager:
  Basically an AnalysisManager<MachineFunction> augmented with the ability to
  register and query IR analyses
- MachineFunctionPassManager: support only two methods, `addPass` and `run`

Reviewed By: arsenm, asbirlea, aeubanks

Differential Revision: https://reviews.llvm.org/D67687
2020-08-07 11:00:31 -07:00
Yuanfang Chen 954bd9c861 [NewPM] Only verify loop for nonskipped user loop pass
No verification for pass mangers since it is not needed.
No verification for skipped loop pass since the asserted condition is not used.

Add a BeforeNonSkippedPass callback for this. The callback needs more
inputs than its parameters to work so the callback is added on-the-fly.

Reviewed By: aeubanks, asbirlea

Differential Revision: https://reviews.llvm.org/D84977
2020-08-07 11:00:31 -07:00
Bevin Hansson 5de6c56f7e [Intrinsic] Add sshl.sat/ushl.sat, saturated shift intrinsics.
Summary:
This patch adds two intrinsics, llvm.sshl.sat and llvm.ushl.sat,
which perform signed and unsigned saturating left shift,
respectively.

These are useful for implementing the Embedded-C fixed point
support in Clang, originally discussed in
http://lists.llvm.org/pipermail/llvm-dev/2018-August/125433.html
and
http://lists.llvm.org/pipermail/cfe-dev/2018-May/058019.html

Reviewers: leonardchan, craig.topper, bjope, jdoerfert

Subscribers: hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83216
2020-08-07 15:09:24 +02:00
Igor Kudrin b6b0ff18a3 [DebugInfo] Clean up DIEUnit. NFC.
This removes members of the DIEUnit class which were used only in unit
tests. Note also that child classes shadowed some of these methods,
namely, getDwarfVersion() was overridden in DwartfUnit and getLength()
was overridden in DwarfCompileUnit.

Differential Revision: https://reviews.llvm.org/D85436
2020-08-07 15:55:44 +07:00
Shinji Okumura c575ba28de [Attributor] AAPotentialValues Interface
This is a split patch of D80991.
This patch introduces AAPotentialValues and its interface only.
For more detail of AAPotentialValues abstract attribute, see the original patch.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D83283
2020-08-07 17:35:12 +09:00
Christian Kühnel f3cc4df51d Revert "[CMake] Simplify CMake handling for zlib"
This reverts commit 1adc494bce.
This patch broke the Windows compilation on buildbot and pre-merge testing:
http://lab.llvm.org:8011/builders/mlir-windows/builds/5945
https://buildkite.com/llvm-project/llvm-master-build/builds/780
2020-08-07 09:36:49 +02:00
biplmish cce1b0e891 [PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang
This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D84622
2020-08-07 01:02:29 -05:00
QingShan Zhang 55de46f3b2 [PowerPC] Support constrained fp operation for setcc
The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281.
This patch is trying to add the support for PowerPC backend.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D81727
2020-08-07 05:16:36 +00:00
Vitaly Buka 0b2616a804 [StackSafety] Skip ambiguous lifetime analysis
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D84630
2020-08-06 19:10:33 -07:00
Vitaly Buka 5c6d9b2bbf [LTO,NFC] Skip generateParamAccessSummary when empty
addGlobalValueSummary can check newly added FunctionSummary
and set HasParamAccess to mark that generateParamAccessSummary
is needed.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D85182
2020-08-06 19:01:19 -07:00
Arthur Eubanks 72c95b2213 [NewPM] Add callback for skipped passes
Parallel to https://reviews.llvm.org/D84772.

Will use this for printing when a pass is skipped.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85478
2020-08-06 18:58:59 -07:00
Matt Arsenault 1ad051dd8c GlobalISel: Implement lower for G_INSERT_VECTOR_ELT 2020-08-06 19:29:17 -04:00
Snehasish Kumar 8d943a928d [NFC] Rename BBSectionsPrepare -> BasicBlockSections.
Rename the BBSectionsPrepare pass as suggested by the review comment in
https://reviews.llvm.org/D85368.

Differential Revision: https://reviews.llvm.org/D85380
2020-08-06 13:12:06 -07:00
Matt Arsenault e00201539f GlobalISel: Implement fewerElementsVector for G_EXTRACT_VECTOR_ELT
Use the same basic strategy as LegalizeVectorTypes. Try to index into
smaller pieces if there's a constant index, and otherwise fall back to
a stack temporary.
2020-08-06 14:33:16 -04:00
Matt Arsenault 1a0c0944c6 AMDGPU: Define raw/struct variants of buffer atomic fadd
Somehow the new FP atomic buffer intrinsics ended up using the legacy
style for buffer intrinsics.
2020-08-06 13:36:19 -04:00
Simon Pilgrim b7b1a38d41 PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI.
We already need to include raw_ostream.h, also add missing StringRef.h implicit dependency.
2020-08-06 16:31:56 +01:00
Sanjay Patel 60f2c6a94c [PatternMatch] allow intrinsic form of min/max with existing matchers
I skimmed the existing users of these matchers and don't see any problems
(eg, the caller assumes the matched value was a select instruction without checking).

So I think we can generalize the matching to allow the new intrinsics or the cmp+select idioms.

I did not find any unit tests for the matchers, so added some basics there. The instsimplify
tests are adapted from existing tests for the cmp+select pattern and cover the folds in
simplifyICmpWithMinMax().

Differential Revision: https://reviews.llvm.org/D85230
2020-08-06 10:50:24 -04:00
Raphael Isemann 1de43bd6df Revert "PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI."
This reverts commit 87c5437afd.

The commit includes several headers in the middle of a function, which
breaks pretty much everything.
2020-08-06 15:15:43 +02:00
Simon Pilgrim 3d10050e37 BitstreamRemarkParser.h - remove unnecessary includes. NFCI.
Remove unused includes, moving to the lib header or cpp file as necessary.
2020-08-06 13:17:53 +01:00
Simon Pilgrim 55ead5bfff Fix include sorting order. NFC 2020-08-06 11:46:53 +01:00
Simon Pilgrim 87c5437afd PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI.
We already need to include raw_ostream.h, also add missing StringRef.h and cstdint implicit dependencies.

Remove unnecessary includes from PDBExtras.cpp
2020-08-06 11:28:42 +01:00
David Green 745bf6cf44 [LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.

In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).

Differential Revision: https://reviews.llvm.org/D75069
2020-08-06 10:10:50 +01:00
Roman Lebedev 5060f5682b
[InstCombine] (-NSW x) s> x --> x s< 0 (PR39480)
Name: (-x) s> x  -->  x s< 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sgt i8 %neg_x, %x
  =>
%r = icmp slt i8 %x, 0

https://rise4fun.com/Alive/ZslD

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Xing GUO 4357986b41 [DWARFYAML][debug_info] Pull out dwarf::FormParams from DWARFYAML::Unit.
Unit.Format, Unit.Version and Unit.AddrSize are replaced with
dwarf::FormParams in D84496 to get rid of unnecessary functions
getOffsetSize() and getRefSize(). However, that change makes it
difficult to make AddrSize optional (Optional<uint8_t>). This change
pulls out dwarf::FormParams from DWARFYAML::Unit and use it as a helper
struct in DWARFYAML::emitDebugInfo().

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D85296
2020-08-06 16:39:00 +08:00
Craig Topper 504a197fe5 [X86] Rename X86::getImpliedFeatures to X86::updateImpliedFeatures and pass clang's StringMap directly to it.
No point in building a vector of StringRefs for clang to apply to the
StringMap. Just pass the StringMap and modify it directly.
2020-08-06 00:20:46 -07:00
Petr Hosek 1adc494bce [CMake] Simplify CMake handling for zlib
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.

This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.

Differential Revision: https://reviews.llvm.org/D79219
2020-08-05 16:07:11 -07:00
Greg Clayton e1de85f9f4 Add verification for DW_AT_decl_file and DW_AT_call_file.
LTO builds have been creating invalid DWARF and one of the errors was a file index that was out of bounds. "llvm-dwarfdump --verify" will check all file indexes for line tables already, but there are no checks for the validity of file indexes in attributes.

The verification will verify if there is a DW_AT_decl_file/DW_AT_call_file that:
- there is a line table for the compile unit
- the file index is valid
- the encoding is appropriate

Tests are added that test all of the above conditions.

Differential Revision: https://reviews.llvm.org/D84817
2020-08-05 15:30:13 -07:00
Rahman Lavaee 20a568c29d [Propeller]: Use a descriptive temporary symbol name for the end of the basic block.
This patch changes the functionality of AsmPrinter to name the basic block end labels as LBB_END${i}_${j}, with ${i} being the identifier for the function and ${j} being the identifier for the basic block. The new naming scheme is consistent with how basic block labels are named (.LBB${i}_{j}), and how function end symbol are named (.Lfunc_end${i}) and helps to write stronger tests for the upcoming patch for BB-Info section (as proposed in https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). The end label is used with basicblock-labels (BB-Info section in future) and basicblock-sections to compute the size of basic blocks and basic block sections, respectively. For BB sections, the section containing the entry basic block will not have a BB end label since it already gets the function end-label.
This label is cached for every basic block (CachedEndMCSymbol) like the label for the basic block (CachedMCSymbol).

Differential Revision: https://reviews.llvm.org/D83885
2020-08-05 13:17:19 -07:00