66647 Commits

Author SHA1 Message Date
Matt Arsenault ea23b6428b AMDGPU: Be explicit about denormal mode in MIR tests
Start checking the machine function in GlobalISel instead of the
target directly.

This temporarily breaks fcanonicalize selection in GlobalISel.
2019-11-19 19:55:43 +05:30
dfukalov 6fd11b14f6 [AMDGPU] Tune inlining parameters for AMDGPU target (part 2)
Summary:
Most IR instructions got better code size estimations after commit 47a5c36b.
So the default parameter values should be updated to improve inlining and
unrolling for the target.

Reviewers: rampitec, arsenm

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70391
2019-11-19 16:33:16 +03:00
Roman Lebedev 6de85095ed [NFC][X86] Fixup comment in CodeGen/X86/cmov.ll
As noted in post-commit review for
https://reviews.llvm.org/D59035#inline-631659
2019-11-19 16:24:07 +03:00
Simon Pilgrim fed8c06892 [ARM] Regenerate vector lane store tests 2019-11-19 13:18:44 +00:00
Simon Pilgrim c7f85f3a84 [PowerPC] Regenerate vsx_insert_extract_le.ll tests 2019-11-19 13:18:44 +00:00
David Bozier 6baec97127 [llvm-objdump] Print relocation addends in hexadecimal
Summary: Matches GNU objdump. Makes debugging easier for me as I'm working out addresses from symbol+addend, so it would be good to be calculating in a single format.

Reviewers: MaskRay, grimar, jhenderson, bd1976llvm

Reviewed By: jhenderson

Subscribers: sdardis, jrtc27, atanasyan, rupprecht, seiya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69997
2019-11-19 12:27:18 +00:00
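A minimal sketch of why a single number base helps when working out addresses from symbol+addend, as the llvm-objdump commit above describes. The function names and the exact output layout are hypothetical illustrations, not llvm-objdump's actual printing code.

```python
def format_relocation_target(symbol: str, addend: int) -> str:
    """Render 'symbol+addend' with the addend in hexadecimal (illustrative layout)."""
    sign = "-" if addend < 0 else "+"
    return f"{symbol}{sign}0x{abs(addend):x}"


def resolve(symbol_address: int, addend: int) -> str:
    """Compute the relocated address, also shown in hexadecimal."""
    return hex(symbol_address + addend)


# With both values in hex, the sum can be read off without converting bases.
print(format_relocation_target(".text", 0x1A))
print(resolve(0x401000, 0x1A))
```

Keeping the symbol address and the addend in the same base means the addition can be checked by eye.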
Simon Pilgrim bbf4af3109 [X86][SSE] Remove XFormVExtractWithShuffleIntoLoad to prevent legalization infinite loops (PR43971)
As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)). There are so many legalization checks at every stage that we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur.

At the moment we see a number of minor regressions as we don't fold extract(shuffle(vecload)) -> scalarload before legal ops; these can be addressed in future patches and by extending X86ISelLowering's combineExtractWithShuffle.
2019-11-19 11:55:44 +00:00
Thomas Preud'homme a89ca4ae17 Fix PR44001: assert failure in getFunctionLocalOffsetAfterInsn
Summary:
Assert in getFunctionLocalOffsetAfterInsn() fails when processing a call
MachineInstr inside a bundle and compiling with debug info. This is
because labels are added by DwarfDebug::beginInstruction() which is
called for each top-level MI by EmitFunctionBody()'s for-loop iteration
but constructCallSiteEntryDIEs() which calls
getFunctionLocalOffsetAfterInsn() iterates over all MIs.

This commit modifies constructCallSiteEntryDIEs() to get the associated
bundle MI for call MIs inside a bundle and use that when calling
getFunctionLocalOffsetAfterInsn() and getLabelAfterInsn(). It also skips
loop iterations for bundle MIs since the loop statements are concerned
with debug info for each physical instruction and bundles represent a
group of instructions. It also fixes the comment about PCAddr since the
code is getting the return address and not the call address.

Reviewers: dstenb, vsk, aprantl, djtodoro, dblaikie, NikolaPrica

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70293
2019-11-19 11:23:11 +00:00
evgeny 4ef9315c4b [ThinLTO] Make ValueInfo::operator bool() explicit
Differential revision: https://reviews.llvm.org/D70383
2019-11-19 12:46:09 +03:00
Sam Parker d43913ae38 [ARM][MVE] Enable narrow vectors for tail pred
Remove the restriction, from the mve tail predication pass, that
all masked vector instructions need to be 128 bits. This allows us
to support extending loads and truncating stores.

Differential Revision: https://reviews.llvm.org/D69946
2019-11-19 08:51:12 +00:00
Sam Parker 8978c12b39 [ARM][MVE] Tail predication conversion
This patch modifies ARMLowOverheadLoops to convert a predicated
vector low-overhead loop into a tail-predicated one. This is currently
a very basic conversion, with the following restrictions:
- Operates only on single block loops.
- The loop can only contain a single vctp instruction.
- No other instructions can write to the vpr.
- We only allow a subset of the mve instructions in the loop.

TODO: Pass the number of elements, not the number of iterations to
dlstp/wlstp.

Differential Revision: https://reviews.llvm.org/D69945
2019-11-19 08:22:18 +00:00
Paweł Bylica d593292f04 [X86] Add more addcarry tests
Summary: More addcarry tests for incoming https://reviews.llvm.org/D70079.

Reviewers: davezarzycki, RKSimon, spatel, craig.topper

Reviewed By: spatel

Subscribers: craig.topper, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70237
2019-11-19 08:59:29 +01:00
Matt Arsenault b337bce871 AMDGPU: Split test functions to avoid dependency on subtarget
Prepare this test for moving the denormal setting out of the
subtarget features.
2019-11-19 11:12:13 +05:30
Matt Arsenault 6f06eda070 bugpoint: Add option to disable attribute removal
This takes a long time and never reduces anything useful for me
(e.g. I've been waiting for 3 hours on a testcase and it hasn't found
any attributes to remove yet). This should probably start by assuming
no attributes matter, and adding back.
2019-11-19 11:11:00 +05:30
Leonard Chan 66b6b92765 Revert "implement printing out raw section data of xcoff objectfile for llvm-objdump"
This reverts commit 8f8a9f3437.

Reverting since this patch seems to break a lot of llvm buildbots.
2019-11-18 20:05:57 -08:00
Steven Wu e84468c1f1 [llvm-cxxfilt] Improve strip-underscore behavior
Summary:
For platforms that use the Mach-O format, c++filt should strip the
leading underscore by default. Introduce the binutils-compatible "-n"
option to control strip-underscore behavior together with the existing
"-_" option, and fall back to the system default if neither is set.

rdar://problem/57173514

Reviewers: compnerd, erik.pilkington, dexonsmith, mattd

Reviewed By: compnerd, erik.pilkington

Subscribers: jkorous, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70250
2019-11-18 15:05:41 -08:00
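The option handling described in the llvm-cxxfilt commit above can be sketched roughly as follows. This is only an illustration: the function and parameter names are hypothetical, and the precedence when both flags are given is an arbitrary choice here, since the commit message does not specify it.

```python
def should_strip_underscore(strip_flag: bool, no_strip_flag: bool,
                            platform_uses_macho: bool) -> bool:
    """Decide whether to strip one leading underscore before demangling."""
    if strip_flag:        # models "-_" (assumed to win if both flags are set)
        return True
    if no_strip_flag:     # models "-n"
        return False
    # Neither flag set: fall back to the platform default.
    return platform_uses_macho


def prepare_symbol(mangled: str, strip: bool) -> str:
    """Drop a single leading underscore when stripping is enabled."""
    if strip and mangled.startswith("_"):
        return mangled[1:]
    return mangled


# On a Mach-O platform with no flags, "__Z3foov" is handed on as "_Z3foov".
print(prepare_symbol("__Z3foov", should_strip_underscore(False, False, True)))
```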
Teresa Johnson aeca47fa0f ThinLTO: Fix assembler to emit alwaysInline in the summary
Summary: The earlier commit (https://reviews.llvm.org/D70014) missed this case: if Always_Inline happens to be the only entry in FuncFlags, then the assembler will not print it in the summary.

Patch by Bharathi Seshadri <bseshadr@cisco.com>

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70323
2019-11-18 15:02:13 -08:00
Eric Christopher 6f1cc4151a Temporarily revert "[SLP] fix miscompile on min/max reductions with extra uses (PR43948)"
as it causes an ICE on valid. A testcase was followed up on the original thread.

This reverts commit a3e61946c5.
2019-11-18 14:41:37 -08:00
diggerlin 5e0a4eddac Adding a test case for read-only data assembly writing for aix
SUMMARY:

Adding a test case for read-only data assembly writing for aix

Reviewers: daltenty, Xiangling_Liao
Subscribers: rupprecht, seiyai, hiraditya

Differential Revision: https://reviews.llvm.org/D70182
2019-11-18 17:07:13 -05:00
Sanjay Patel b763924bd0 [SLP] reduce duplicated check lines in tests; NFC 2019-11-18 17:03:07 -05:00
Stefan Pintilie 6512473cee [PowerPC] Improve float vector gather codegen
This patch aims to improve the code generation for float vector gather on POWER9.
Patterns have been implemented to utilize instructions that deliver improved
performance.

Patch by: Kamau Bridgeman

Differential Revision: https://reviews.llvm.org/D62908
2019-11-18 15:53:32 -06:00
diggerlin 8f8a9f3437 implement printing out raw section data of xcoff objectfile for llvm-objdump
SUMMARY:
implement printing out raw section data of xcoff objectfile for llvm-objdump
and the -D (--disassemble-all) option for llvm-objdump

Reviewers: Sean Fertile
Subscribers: rupprecht, seiyai, hiraditya

Differential Revision: https://reviews.llvm.org/D70255
2019-11-18 15:24:55 -05:00
Craig Topper 6e20d70a69 [LegalizeDAG] Convert strict fp nodes to libcalls without losing the chain.
Previously we mutated the node and then converted it to a libcall. But this loses the chain information.

This patch keeps the chain, but unfortunately breaks tail call optimization as the functions involved in deciding if a node is in tail call position can't handle the chain. But getting the ordering right seems more important.

Somehow the SystemZ tests improved. I looked at one of them and it seemed that we're handling the split vector elements in a different order and that made the copies work better.

Differential Revision: https://reviews.llvm.org/D70334
2019-11-18 11:24:08 -08:00
Philip Reames ad5a84c883 [LoopPred/WC] Use a dominating widenable condition to remove analyze loop exits
This implements a version of the predicateLoopExits transform from IndVarSimplify extended to exploit widenable conditions - and thus be much wider in scope of legality. The code structure ends up being almost entirely different, so I chose to duplicate this into the LoopPredication pass instead of trying to reuse the code in the IndVars.

The core notions of the transform are as follows:

    If we have a widenable condition which controls entry into the loop, we're allowed to widen it arbitrarily. Given that, it's simply a *profitability* question as to what conditions to fold into the widenable branch.
    To avoid pass ordering issues, we want to avoid widening cases that would otherwise be dischargeable. Or... widen in a form which can still be discharged. Thus, we phrase the transform as selecting one analyzeable exit from the set of analyzeable exits to keep. This avoids creating pass ordering complexities.
    Since none of the above proves that we actually exit through our analyzeable exits - we might exit through something else entirely - we limit ourselves to cases where a) the latch is analyzeable and b) the latch is predicted taken, and c) the exit being removed is statically cold.

Differential Revision: https://reviews.llvm.org/D69830
2019-11-18 11:23:29 -08:00
Stefan Pintilie 9d93893914 [PowerPC] Test case for vector float gather on ppc64le and ppc64
Test case to verify that the expected code is generated for a
vector float gather based on the patterns in tablegen for big
and little endian cases.

Patch by: Kamau Bridgeman

Differential Revision: https://reviews.llvm.org/D69443
2019-11-18 13:17:07 -06:00
Fangrui Song 63f0f54c89 [yaml2obj][test] Move tests to binary format specific subdirectories
Create COFF/, ELF/, and Minidump/ and move tests there.

Also

* Rename `*.test` to `*.yaml`
* For yaml2obj RUN lines, use `-o %t` instead of `> %t` for consistency.
  We still have tests that check stdout is the default output, e.g.
  multi-doc.test
* Update tests to consistently use `##` for comments.
  `#` is for RUN and CHECK lines.
* Merge symboless-relocation.yaml and invalid-symboless-relocation.yaml to ELF/relocation-implicit-symbol-index.test

Reviewed By: grimar, jhenderson

Differential Revision: https://reviews.llvm.org/D70264
2019-11-18 09:06:14 -08:00
Pavel Labath fa54186056 [NFC] Clean up debug-names-verify-completeness.s test
This patch replaces the tabs with spaces and avoids the need for a
debug_str section by moving all strings inline. It also removes the
hardcoded DIE offsets in the test, which will simplify a follow-up
patch.
2019-11-18 16:33:29 +01:00
Russell Gallop aea7578fad [NFC] Fix test reserve_global_reg.ll after 2d739f9 2019-11-18 15:04:32 +00:00
Sam McCall d27a16eb39 Revert "[DWARF5]Addition of alignment attribute in typedef DIE."
This reverts commit 423f541c1a, which
breaks llvm-c ABI.
2019-11-18 15:53:22 +01:00
Tim Northover dea8f3b0a4 arm64_32: support function return in FastISel. 2019-11-18 14:35:05 +00:00
Pavel Labath dca2b36ba0 Re-commit "DWARF location lists: Add section index dumping"
This reapplies c0f6ad7d1f with an
additional fix in test/DebugInfo/X86/constant-loclist.ll, which had a
slightly different output on windows targets. The test now accounts for
this difference.

The original commit message follows.

Summary:
As discussed in D70081, this adds the ability to dump section
names/indices to the location list dumper. It does this by moving the
range specific logic from DWARFDie.cpp:dumpRanges into the
DWARFAddressRange class.

The trickiest part of this patch is the backflip in the meanings of the
two dump flags for the location list sections.

The dumping of "raw" location list data is now controlled by
"DisplayRawContents" flag. This frees up the "Verbose" flag to be used
to control whether we print the section index. Additionally, the
DisplayRawContents flag is set for section-based dumps whenever the
--verbose option is passed, but this is not done for the "inline" dumps.

Also note that the index dumping currently does not work for the DWARF
v5 location lists, as the parser does not fill out the appropriate
fields. This will be done in a separate patch.

Reviewers: dblaikie, probinson, JDevlieghere, SouraVX

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, arphaman, aprantl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70227
2019-11-18 15:30:10 +01:00
Dmitry Preobrazhensky edd9f70163 [AMDGPU][MC][GFX10] Enabled v_movrel*[sdwa|dpp|dpp8] opcodes
See https://bugs.llvm.org/show_bug.cgi?id=43712

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D70170
2019-11-18 17:23:40 +03:00
Simon Pilgrim c070a27acc Revert rGc0f6ad7d1f3c : "DWARF location lists: Add section index dumping"
This reverts commit c0f6ad7d1f to fix the buildbots.
2019-11-18 13:26:51 +00:00
Aaron Smith dbb64b39b8 Fix a print error found while testing llvm-objcopy
A value was not printed as hex. This updates the output and test cases.
2019-11-18 13:07:35 +00:00
czhengsz 1ce5fcda17 [PowerPC] [NFC] add IR testcases for folding rlwinma. 2019-11-18 07:43:30 -05:00
Simon Pilgrim b68191e729 [X86][SSE] Add test for extractelement with multiple uses
Mentioned in D70267
2019-11-18 11:36:14 +00:00
Simon Cook eedb964822 [RISCV] Add assembly mnemonic spell checking
Summary:
This allows the assembler to suggest alternative assembly mnemonics when
an invalid one has been provided.

Reviewers: asb, lenary, lewis-revill

Reviewed By: asb

Subscribers: hiraditya, rbar, johnrusso, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69894
2019-11-18 10:58:00 +00:00
QingShan Zhang 03e7fb2e07 [NFC][Test] Add the vavg test for PowerPC 2019-11-18 10:41:47 +00:00
Simon Tatham f4f77aa53e [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i.
If you're writing C code using the ACLE MVE intrinsics that passes the
result of a vcmp as input to a predicated intrinsic, e.g.

  mve_pred16_t pred = vcmpeqq(v1, v2);
  v_out = vaddq_m(v_inactive, v3, v4, pred);

then clang's codegen for the compare intrinsic will create calls to
`@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an
`mve_pred16_t` integer representation, and then the next intrinsic
will call `@llvm.arm.mve.pred.i2v` to convert it straight back again.
This will be visible in the generated code as a `vmrs`/`vmsr` pair
that move the predicate value pointlessly out of `p0` and back into it again.

To prevent that, I've added InstCombine rules to remove round trips of
the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine
about the known and demanded bits of those intrinsics. As a result,
you now get just the generated code you wanted:

  vpt.u16 eq, q1, q2
  vaddt.u16 q0, q3, q4

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70313
2019-11-18 10:39:30 +00:00
Anna Welker 2d739f98d8 [ARM] Allocatable Global Register Variables for ARM
Provides support for using r6-r11 as globally scoped
      register variables. This requires a -ffixed-rN flag
      in order to reserve rN against general allocation.

      If for a given GRV declaration the corresponding flag
      is not found, or the register in question is the
      target's FP, we fail with a diagnostic.

      Differential Revision: https://reviews.llvm.org/D68862
2019-11-18 10:07:37 +00:00
Pavel Labath c0f6ad7d1f DWARF location lists: Add section index dumping
Summary:
As discussed in D70081, this adds the ability to dump section
names/indices to the location list dumper. It does this by moving the
range specific logic from DWARFDie.cpp:dumpRanges into the
DWARFAddressRange class.

The trickiest part of this patch is the backflip in the meanings of the
two dump flags for the location list sections.

The dumping of "raw" location list data is now controlled by
"DisplayRawContents" flag. This frees up the "Verbose" flag to be used
to control whether we print the section index. Additionally, the
DisplayRawContents flag is set for section-based dumps whenever the
--verbose option is passed, but this is not done for the "inline" dumps.

Also note that the index dumping currently does not work for the DWARF
v5 location lists, as the parser does not fill out the appropriate
fields. This will be done in a separate patch.

Reviewers: dblaikie, probinson, JDevlieghere, SouraVX

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, arphaman, aprantl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70227
2019-11-18 10:50:22 +01:00
James Clarke 816ff985f5 [Sparc] Fix "Cannot select" error for AtomicFence on 32-bit V9
Summary:
This also adds testing of 32-bit V9 atomic lowering, splitting the
64-bit-only tests out into their own file.

Reviewers: venkatra, jyknight

Reviewed By: jyknight

Subscribers: hiraditya, fedor.sergeev, jfb, llvm-commits, glaubitz

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69352
2019-11-18 09:45:10 +00:00
Craig Topper af435286e5 [LegalizeTypes][X86] Add support for expanding the result type of STRICT_LLROUND and STRICT_LLRINT.
This doesn't handle softening the input type, but we don't handle
softening any of the strict nodes yet. Skipping that made it easy
to reuse an existing function for creating a libcall from a node
with a chain.
2019-11-17 20:03:05 -08:00
czhengsz a0337d269b [PowerPC] extend PPCPreIncPrep Pass for ds/dq form
Now, PPCPreIncPrep pass changes a loop to update form and update all load/store
with same base accordingly. We can do more for load/store with same base, for
example, convert load/store with same base to ds/dq form.

Reviewed by: jsji

Differential Revision: https://reviews.llvm.org/D67088
2019-11-17 21:38:43 -05:00
Sanjay Patel 5d67d81f48 [InstCombine] prevent crashing/assert on shift constant expression (PR44028)
The binary operator cast implies an instruction, but the matcher for shift does not:
https://bugs.llvm.org/show_bug.cgi?id=44028
2019-11-17 17:31:09 -05:00
Florian Hahn 8eeabbaf5d [ConstantFold] Handle identity folds at top of ConstantFoldBinaryInst
Currently we miss folds with undef and identity values for binary ops
that do not fold to undef in general.

We can generalize the identity simplifications and do them before
checking for undef in particular.

Alive checks:
 * OR - https://rise4fun.com/Alive/8OsK
 * AND - https://rise4fun.com/Alive/e3tE

This will also allow us to remove some now redundant cases throughout
the function, but I would like to do this as follow-up. That should make
tracking down potential issues easier.

Reviewers: spatel, RKSimon, lebedev.ri

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D70169
2019-11-17 21:30:14 +00:00
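A rough model of the ordering change in the ConstantFold commit above: identity folds run first, and only then does the generic undef handling apply. This is a simplified 8-bit sketch, not the ConstantFoldBinaryInst code. The `UNDEF` sentinel, the table names, and the assumed generic results (all-ones for `or` with undef, zero for `and` with undef) are illustrative assumptions.

```python
UNDEF = object()          # hypothetical stand-in for an undef constant
MASK = 0xFF               # model 8-bit integers

IDENTITY = {"or": 0, "and": MASK}            # x | 0 == x, x & -1 == x
GENERIC_UNDEF_RESULT = {"or": MASK, "and": 0}  # assumed generic undef folds


def fold_binop(op, lhs, rhs):
    """Fold `lhs op rhs`, checking identity values before undef."""
    identity = IDENTITY[op]
    # Step 1: identity folds, applied even when the other operand is undef.
    if rhs is not UNDEF and rhs == identity:
        return lhs
    if lhs is not UNDEF and lhs == identity:
        return rhs
    # Step 2: generic undef handling.
    if lhs is UNDEF and rhs is UNDEF:
        return UNDEF
    if lhs is UNDEF or rhs is UNDEF:
        return GENERIC_UNDEF_RESULT[op]
    # Step 3: ordinary constant folding.
    return (lhs | rhs) if op == "or" else (lhs & rhs)


# `or undef, 0` now folds to undef instead of the generic all-ones result.
print(fold_binop("or", UNDEF, 0) is UNDEF)
```

Running the identity check first is what lets the fold keep `undef` here, which is less constrained than a fixed constant.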
Florian Hahn 28c183859a [ConstantFold] Add some tests for binops with constants and undefs.
Precommit tests for D70169.
2019-11-17 21:10:45 +00:00
Stefan Stipanovic a516fbac52 [Attributor] Use nofree argument attribute for heap-to-stack conversion
Reviewers: jdoerfert, uenoku

Subscribers:

Differential Revision: https://reviews.llvm.org/D70140
2019-11-17 21:35:04 +01:00
Sanjay Patel ebf9bf2cbc [SimplifyCFG] propagate fast-math-flags (FMF) from phi to select
Similar to/extension of D70208 (rGee0882bdf866), but this one
may finally allow closing motivating bugs.

This is another step towards having FMF apply only to FP values
rather than those + fcmp. See PR38086 for one of the original
discussions/motivations:
https://bugs.llvm.org/show_bug.cgi?id=38086

And the test here is derived from PR39535:
https://bugs.llvm.org/show_bug.cgi?id=39535

Currently, we lose FMF when converting any phi to select in
SimplifyCFG. There are a small number of similar changes needed
to correct within SimplifyCFG, so it should be quick to patch
this pass up.

FMF was extended to select and phi with:
D61917
D67564
2019-11-17 11:23:44 -05:00
Sanjay Patel 23f736059c [SimplifyCFG] add fast-math-flags to tests for better coverage; NFC
The conversion to select fails to propagate FMF.
2019-11-17 10:37:42 -05:00
Sanjay Patel f5870b0f36 [SimplifyCFG] add tests for possible FP speculative select; NFC
It doesn't seem that there are any perf/param knobs that can be turned
to create selects for the FP variants of the tests, but that may not
always be true in the future. If it changes, we should propagate FMF.
2019-11-17 10:27:47 -05:00
David Green 08390c52a2 [InstCombine] Canonicalize ssub.with.overflow with clamp to ssub.sat
Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sats.

Differential Revision: https://reviews.llvm.org/D69753
2019-11-17 10:45:11 +00:00
David Green 03fce6b12e [InstCombine] Canonicalize sadd.with.overflow with clamp to sadd.sat
This adds to D69245, adding extra signed patterns for folding from a
sadd_with_overflow to a sadd_sat. These are more complex than the
unsigned patterns, as the overflow can occur in either direction.

For the add case, the positive overflow can only occur if both of the
values are positive (same for both the values being negative). So there
is an extra select on whether to use the positive or negative overflow
limit.

Differential Revision: https://reviews.llvm.org/D69252
2019-11-17 10:42:39 +00:00
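The signed pattern described in the sadd.sat commit above can be illustrated with a small 8-bit model. This is a sketch of the arithmetic only, not the InstCombine matcher; the helper names are hypothetical. It shows the extra select between the positive and negative limit, and checks the clamped form against a direct saturating add.

```python
BITS = 8
INT_MIN = -(1 << (BITS - 1))
INT_MAX = (1 << (BITS - 1)) - 1


def wrap(value: int) -> int:
    """Wrap an integer to signed BITS-bit two's complement."""
    value &= (1 << BITS) - 1
    return value - (1 << BITS) if value > INT_MAX else value


def sadd_with_overflow(a: int, b: int):
    """Model of a signed add returning (wrapped result, overflow bit)."""
    exact = a + b
    return wrap(exact), not (INT_MIN <= exact <= INT_MAX)


def clamped_add(a: int, b: int) -> int:
    """The source pattern: overflow check followed by a clamp."""
    result, overflowed = sadd_with_overflow(a, b)
    if not overflowed:
        return result
    # Overflow needs both operands to share a sign, so the sign of either
    # one selects which limit to clamp to.
    return INT_MAX if a > 0 else INT_MIN


def sadd_sat(a: int, b: int) -> int:
    """The canonical form: a saturating signed add."""
    return max(INT_MIN, min(INT_MAX, a + b))


print(clamped_add(100, 100), clamped_add(-100, -100), clamped_add(100, -100))
```

Because the two forms agree on every 8-bit input pair, replacing one with the other preserves behavior in this model.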
David Green 7bed2cb853 [InstCombine] Add extra tests for overflow_to_sat.ll. NFC 2019-11-17 10:34:28 +00:00
Aditya Nandakumar cc6b853901 [MIRNamer]: Make the check lines in the test robust with regex.
Previously we were checking for specific hashes. Make it check for
regexes.

Should fix failure caused by: 7276868556
2019-11-16 22:58:45 -08:00
Sourabh Singh Tomar 423f541c1a [DWARF5]Addition of alignment attribute in typedef DIE.
This patch adds support for the DW_AT_alignment (DWARF5) attribute, to be emitted with the typedef DIE
when explicit alignment is specified.

Patch by Awanish Pandey <Awanish.Pandey@amd.com>

Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok,
deadalinx

Differential Revision: https://reviews.llvm.org/D70111
2019-11-16 21:56:53 +05:30
James Y Knight bf142fc433 MCObjectStreamer: assign MCSymbols in the dummy fragment to offset 0.
In MCObjectStreamer, when there is no current fragment, initially
symbols are created in a "pending" state and assigned to a dummy
empty fragment.

Previously, they were not being assigned an offset, and thus
evaluateAbsolute would fail if trying to evaluate an expression 'a -
b', where both 'a' and 'b' were in this pending state.

Also slightly refactored the EmitLabel overload which takes an
MCFragment for clarity.

Fixes: https://llvm.org/PR41825

Differential Revision: https://reviews.llvm.org/D70062
2019-11-16 09:52:07 -05:00
Shiva Chen cf6cf0cd14 [RISCV] Handle variable sized objects when the stack needs to be realigned
Differential Revision: https://reviews.llvm.org/D68979
2019-11-16 12:39:53 +08:00
David Blaikie 77cfcd7509 DebugInfo: Use loclistx for DWARFv5 location lists to reduce the number of relocations
This only implements the non-dwo part, but loclistx is necessary to use
location lists in DWARFv5, so it's a precursor to that work - and
generally reduces relocations (only using one reloc, then
indexes/relative offsets for all location list references) in non-split
DWARF.
2019-11-15 18:51:13 -08:00
David Blaikie d295087639 DebugInfo: Templatize rnglist header parsing to setup for reuse with loclist header parsing 2019-11-15 16:23:02 -08:00
Thomas Lively 194d7ec081 [WebAssembly] Fix miscompile of select with and
Summary:
Rolls back the remaining bad optimizations introduced in
eb15d00193. Some of them were already rolled back in e661f946a7 and
this finishes the job.

Fixes https://bugs.llvm.org/show_bug.cgi?id=44012.

Reviewers: dschuff, aheejin

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70347
2019-11-15 16:22:01 -08:00
Quentin Colombet 304abde077 [GISel][CombinerHelper] Add support for scalar type for the result of shuffle vector
LLVM IR of 1-element vectors gets lowered into scalars in GISel. As a
result, shuffle vector may also produce a scalar.

This patch teaches the shuffle combiner how to deal with scalars when
they are in the destination type of a shuffle vector.

For now, we just support the easy case where this can be lowered to
a plain copy. For other cases, we leave the shuffle vector as is.

This type of IR is seen in O0 pipelines. E.g., as produced with
SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c.

rdar://problem/57198904
2019-11-15 13:54:33 -08:00
Sanjay Patel ee0882bdf8 [SimplifyCFG] propagate fast-math-flags (FMF) from phi to select
This is another step towards having FMF apply only to FP values
rather than those + fcmp. See PR38086 for one of the original
discussions/motivations:
https://bugs.llvm.org/show_bug.cgi?id=38086

And the test here is derived from PR39535:
https://bugs.llvm.org/show_bug.cgi?id=39535

Currently, we lose FMF when converting any phi to select in
SimplifyCFG. There are a small number of similar changes needed
to correct within SimplifyCFG, so it should be quick to patch
this pass up.

FMF was extended to select and phi with:
D61917
D67564

Differential Revision: https://reviews.llvm.org/D70208
2019-11-15 16:14:35 -05:00
Simon Atanasyan 6108eb4e5c [mips] Enable `la` pseudo instruction on 64-bit arch.
This patch makes LLVM compatible with GAS. It accepts `la` pseudo
instruction on arch with 64-bit pointers and just shows a warning.

Differential Revision: https://reviews.llvm.org/D70202
2019-11-15 23:38:14 +03:00
Simon Atanasyan 0287efb891 [mips] Do not emit R_MIPS_JALR for sym+offset in case of O32 ABI
O32 ABI uses relocations in REL format. Relocation's addend is written
in place. R_MIPS_JALR relocation points to the `jalr` instruction which
does not have a place to store the relocation addend. So it's impossible
to save non-zero "offset". This patch blocks emission of `R_MIPS_JALR`
relocations in such cases.

Differential Revision: https://reviews.llvm.org/D70201
2019-11-15 23:38:14 +03:00
Rachel Craik f897d087d0 [LoopCacheAnalysis]: Fix assertion failure during cost computation
Ensure the stride and trip count have the same type before multiplying them during reference cost calculation

Reviewed By: jdoefert

Differential Revision: https://reviews.llvm.org/D70192
2019-11-15 14:56:26 -05:00
Francesco Petrogalli d6de5f12d4 [SVFS] Inject TLI Mappings in VFABI attribute.
This patch introduces a function pass to inject the scalar-to-vector
mappings stored in the TargetLibraryInfo (TLI) into the Vector
Function ABI (VFABI) variants attribute.

The test is testing the injection for three vector libraries supported
by the TLI (Accelerate, SVML, MASSV).

The pass does not change any of the analyses associated with the
function.

Differential Revision: https://reviews.llvm.org/D70107
2019-11-15 18:42:56 +00:00
Fangrui Song 28a5dc7fc5 [llvm-objcopy][MachO] Implement --redefine-sym and --redefine-syms
Similar to D46029 (ELF) and D70036 (COFF), but for MachO.
Note, when --strip-symbol (not implemented for MachO) is also specified,
--redefine-sym executes before --strip-symbol.

Reviewed By: jhenderson, seiya

Differential Revision: https://reviews.llvm.org/D70212
2019-11-15 10:05:36 -08:00
Vedant Kumar 67c416dc9a [DebugInfo] Allow spill slots in call site parameter descriptions
Allow call site parameter descriptions to reference spill slots. Spill
slots are not visible to high-level LLVM IR, so they can safely be
referenced during entry value evaluation (as they cannot be clobbered by
some other function).

This gives a 5% increase in the number of call site parameter DIEs in an
LTO x86_64 build of the xnu kernel.

This reverts commit eb4c98ca3d (
[DebugInfo] Exclude memory location values as parameter entry values),
effectively reintroducing the portion of D60716 which dealt with memory
locations (authored by Djordje, Nikola, Ananth, and Ivan).

This partially addresses llvm.org/PR43343. However, not all memory
operands forwarded to callees live in spill slots. In the xnu build, it
may be possible to use an escape analysis to increase the number of call
site parameters by another 15% (more details in PR43343).

Differential Revision: https://reviews.llvm.org/D70254
2019-11-15 09:55:36 -08:00
Aditya Nandakumar 7276868556 [MirNamer][Canonicalizer]: Perform instruction semantic based renaming
https://reviews.llvm.org/D70210

Previously:

Due to sensitivity of the algorithm with gaps, and extra instructions,
when diffing, often we see naming being off by a few. Makes the diff
unreadable even for tests with 7 and 8 instructions respectively.
Naming can change depending on candidates (and order of picking
candidates). Suddenly if there's one extra instruction somewhere, the
entire subtree would be named completely differently.
No consistent naming of similar instructions which occur in different
functions. If we try to do something like count the frequency
distribution of various differences across suite, then the above
sensitivity issues are going to result in poor results.
Instead:

Name instruction based on semantics of the instruction (hash of the
opcode and operands). Essentially for a given instruction that occurs in
any module/function it'll be named similarly (ie semantic). This has
some nice properties
Can easily look at many instructions and just check the hash and if
they're named similarly, then it's the same instruction. Makes it very
easy to spot the same instruction both multiple times, as well as across
many functions (useful for frequency distribution).
Independent of traversal/candidates/depth of graph. No need to keep
track of last index/gaps/skip count etc.
No off by few issues with diffs. I've tried the old vs new
implementation in files ranging from 30 to 700 instructions. In both
cases with the old algorithm, diffs are a sea of red, where as for the
semantic version, in both cases, the diffs line up beautifully.
Simplified implementation of the main loop (simple iteration) , no keep
track of what's visited and not.
Handle collision just by incrementing a counter. Roughly
bb[N]_hash_[CollisionCount].
Additionally with the new implementation, we can probably avoid doing
the hoisting of instructions to various places, as they'll likely be
named the same resulting in differences only based on collision (ie
regardless of whether the instruction is hoisted or not/close to use or
not, it'll be named the same hash which should result in use of the
instruction be identical with the only change being the collision count)
which is very easy to spot visually.
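
A minimal sketch of the naming scheme described above, assuming a toy
representation where each instruction is an (opcode, operands) tuple. The
function name, the use of SHA-1 and the 8-character digest are illustrative
assumptions, not the code from this commit:

```python
import hashlib
from collections import defaultdict


def name_instructions(blocks):
    """Name each instruction bb[N]_hash_[CollisionCount].

    `blocks` is a list of basic blocks; each block is a list of
    (opcode, operands) tuples. This layout is hypothetical.
    """
    names = []
    for bb_index, block in enumerate(blocks):
        seen = defaultdict(int)  # hash -> how many times it occurred in this block
        bb_names = []
        for opcode, operands in block:
            # The name depends only on what the instruction is,
            # not on its position or on any traversal order.
            key = opcode + "(" + ",".join(operands) + ")"
            digest = hashlib.sha1(key.encode()).hexdigest()[:8]
            collision = seen[digest]
            seen[digest] += 1
            bb_names.append(f"bb{bb_index}_{digest}_{collision}")
        names.append(bb_names)
    return names
```

With this scheme, adding an unrelated instruction in the middle of a block
leaves the names of the surrounding instructions unchanged, which is the
property that keeps diffs aligned.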
2019-11-15 08:38:54 -08:00
Sergey Dmitriev 840c891a8c [llvm-objcopy][NFC] Use generated object file in COFF/add-section.test
Updated LIT test from D70205 to use generated object file with extended relocation table.

Differential Revision: https://reviews.llvm.org/D70269
2019-11-15 08:10:17 -08:00
Simon Pilgrim c3607f52b1 [X86][SSE] Add test for extractelement from volatile vector load
Mentioned in D70267
2019-11-15 15:59:33 +00:00
Simon Tatham b0c1900820 [ARM,MVE] Add reversed isel patterns for MVE `vcmp qN,rN`
Summary:
As well as vector/vector compare instructions, MVE also has a family
of comparisons taking a vector and a scalar, which compare every lane
of the vector against the same value. We generate those at isel time
using isel patterns that match `(ARMvcmp vector, (ARMvdup scalar))`.

This commit adds corresponding patterns for the operand-reversed form
`(ARMvcmp (ARMvdup scalar), vector)`, with condition codes swapped as
necessary. That way, we can still generate the vector/scalar compare
instruction if the IR happens to have been rearranged to put the
operands the other way round, which can happen in some optimization
phases. Previously, a vcmp the other way round was handled by emitting
a `vdup` instruction to //explicitly// replicate the scalar input into
a vector, and then doing a vector/vector comparison.
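
As a rough illustration of why the condition code has to be swapped for the
operand-reversed form, here is a small model of a lane-wise compare. The
helper names and the condition-code spellings are assumptions made for this
sketch only:

```python
import operator

# Scalar comparison for each (hypothetical) condition-code name.
COMPARE = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}

# Condition to use once the two operands have traded places.
SWAPPED = {"eq": "eq", "ne": "ne", "lt": "gt", "gt": "lt", "le": "ge", "ge": "le"}


def vcmp_vector_scalar(cond, vector, scalar):
    """Model of the vector/scalar compare: every lane against one scalar."""
    return [COMPARE[cond](lane, scalar) for lane in vector]


def vcmp_scalar_vector(cond, scalar, vector):
    """Reversed form (cmp cond, dup(scalar), vector), rewritten as the
    vector/scalar compare with the swapped condition."""
    return vcmp_vector_scalar(SWAPPED[cond], vector, scalar)
```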

I haven't added a new test, because it turned out that several
existing tests were already exhibiting that failure mode. So just
updating the expected output in the existing MVE codegen tests
demonstrates what's been improved.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70296
2019-11-15 14:06:00 +00:00
Piotr Sobczak 02419ab5c7 [AMDGPU] Lower llvm.amdgcn.s.buffer.load.v3[i|f]32
Summary: Add lowering support for 32-bit vec3 variant of s.buffer.load intrinsic.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70118
2019-11-15 15:01:15 +01:00
Pavel Labath 0908093977 DWARFDebugLoc(v4): Add an incremental parsing function
Summary:
This adds a visitLocationList function to the DWARF v4 location lists,
similar to what already exists for DWARF v5. It follows the approach
outlined in previous patches (D69672), where the parsed form is always
stored in the DWARF v5 format, which makes it easier for generic code to
be built on top of that. v4 location lists are "upgraded" during
parsing, and then this upgrade is undone while dumping.

Both "inline" and section-based dumping is rewritten to reuse the
existing "generic" location list dumper. This means that the output
format is consistent for all location lists (the only thing one needs to
implement is the function which prints the "raw" form of a location
list), and that debug_loc dumping correctly processes base address
selection entries, etc.

The previous existing debug_loc functionality (e.g.,
parseOneLocationList) is rewritten on top of the new API, but it is not
removed as there is still code which uses them. This will be done in
follow-up patches, after I build the API to access the "interpreted"
location lists in a generic way (as that is what those users really
want).

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69847
2019-11-15 13:38:00 +01:00
Sjoerd Meijer 71327707b0 [ARM][MVE] tail-predication
This is a follow-up to d90804d, to also flag fcmp instructions as instructions
that we do not support in tail-predicated vector loops.

Differential Revision: https://reviews.llvm.org/D70295
2019-11-15 11:01:13 +00:00
Petar Avramovic 1f559353a7 [MIPS GlobalISel] Select andi, ori and xori
Introduce IntImmLeaf version of PatLeaf immZExt16 for 32-bit immediates.
Change immZExt16 with imm32ZExt16 for andi, ori and xori.
This keeps same behavior for SDAG and allows for GlobalISel selectImpl
to select 'G_CONSTANT imm' + G_AND, G_OR, G_XOR into ANDi, ORi, XORi,
respectively, when 32-bit imm satisfies imm32ZExt16 predicate: zero
extending 16 low bits of imm is equal to imm.
Large number of test changes comes from zero extending of small types
which is transformed into 'and' with bitmask in legalizer.
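
The predicate above can be sketched as follows. This is only a model of the
check on a 32-bit immediate; the function name is made up for the example:

```python
def imm32_zext16(imm):
    """True if zero-extending the low 16 bits of a 32-bit immediate gives it back."""
    imm &= 0xFFFFFFFF             # treat the value as a 32-bit quantity
    return (imm & 0xFFFF) == imm  # the upper 16 bits must all be zero
```

For instance, 0xFFFF satisfies the predicate, while 0x10000 and -1 (which is
0xFFFFFFFF as a 32-bit value) do not.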

Differential Revision: https://reviews.llvm.org/D70185
2019-11-15 11:41:25 +01:00
Petar Avramovic dda8e95540 [MIPS GlobalISel] Select addiu
Introduce IntImmLeaf version of PatLeaf immSExt16 for 32-bit immediates.
Change immSExt16 with imm32SExt16 for addiu.
This keeps same behavior for SDAG and allows for GlobalISel selectImpl
to select 'G_CONSTANT imm' + G_ADD into ADDIu when 32-bit imm satisfies
imm32SExt16 predicate: sign extending 16 low bits of imm is equal to imm.
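
A small model of the sign-extension check, with hypothetical helper names,
assuming the immediate is interpreted as a signed 32-bit value:

```python
def sign_extend16(value):
    """Sign-extend the low 16 bits of `value` to a Python int."""
    low = value & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low


def imm32_sext16(imm):
    """True if sign-extending the low 16 bits of a 32-bit immediate gives it back."""
    imm &= 0xFFFFFFFF
    signed = imm - (1 << 32) if imm & 0x80000000 else imm
    return sign_extend16(signed) == signed
```

Under this model the accepted range is -32768 through 32767.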

Differential Revision: https://reviews.llvm.org/D70184
2019-11-15 11:36:13 +01:00
Mikael Holmen 1587c7e86f [Scalarizer] Treat values from unreachable blocks as undef
Summary:
When scalarizing PHI nodes we might try to examine/rewrite
InsertElement nodes in predecessors. If those predecessors
are unreachable from entry, then the IR in those blocks could
have unexpected properties resulting in infinite loops in
Scatterer::operator[].
By simply treating values originating from instructions in
unreachable blocks as undef we do not need to analyse them
further.
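
The idea can be sketched on a toy CFG. The dictionary-based graph, the
`UNDEF` marker and both function names are assumptions for illustration and
do not mirror the Scalarizer's actual data structures:

```python
UNDEF = "undef"


def reachable_blocks(cfg, entry):
    """Return the set of blocks reachable from `entry`.

    `cfg` maps a block name to the list of its successors.
    """
    seen = set()
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in seen:
            continue
        seen.add(block)
        stack.extend(cfg.get(block, ()))
    return seen


def scatter_operand(value, defining_block, reachable):
    """Use undef for values defined in unreachable blocks instead of analysing them."""
    if defining_block not in reachable:
        return UNDEF
    return value
```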

This fixes PR41723.

Reviewers: bjope

Reviewed By: bjope

Subscribers: bjope, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70171
2019-11-15 11:13:37 +01:00
Matt Arsenault 31479d868e AMDGPU: Change boolean content type to 0 or 1
The usage of target boolean checks is overly inflexible, since sext
and zext of a compare are equally cheap. The choice is arbitrary, but
using 0/1 to some degree is the choice of lower resistance since
that's what most targets use. This enables a few combines that don't
bother to support ZeroOrNegativeOneBooleanContent.
2019-11-15 13:43:47 +05:30
Matt Arsenault 69fcfb7d35 AMDGPU: Try to commute sub of boolean ext
Avoids another regression in a future patch.
2019-11-15 13:43:42 +05:30
Matt Arsenault bc276c6379 GlobalISel: Lower s1 source G_SITOFP/G_UITOFP 2019-11-15 13:37:20 +05:30
Seiya Nuta bc11830c6a [llvm-objcopy][MachO] Implement --remove-section
Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: rupprecht, jhenderson

Subscribers: jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66282
2019-11-15 14:20:11 +09:00
Wang, Pengfei 8723b95cef [WinEH] Fix the wrong alignment orientation during calculating EH frame.
Summary: This is a bug fix for further issues in PR43585.

Reviewers: rnk, RKSimon, craig.topper, andrew.w.kaylor

Subscribers: hiraditya, llvm-commits, annita.zhang

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70224
2019-11-15 09:42:38 +08:00
Alexey Bataev bfa32573bf Revert "Temporarily Revert:"
This reverts commit e511c4b0dff1692c267addf17dce3cebe8f97faa:

    Temporarily Revert:

     "[SLP] Generalization of stores vectorization."
     "[SLP] Fix -Wunused-variable. NFC"
     "[SLP] Vectorize jumbled stores."

after fixing the problem with compile time.
2019-11-14 16:38:20 -05:00
Vedant Kumar 1ee84e5ab2 [DebugInfo] Allow spill slots in call site parameter descriptions
Allow call site parameter descriptions to reference spill slots. Spill
slots are not visible to high-level LLVM IR, so they can safely be
referenced during entry value evaluation (as they cannot be clobbered by
some other function).

This gives a 5% increase in the number of call site parameter DIEs in an
LTO x86_64 build of the xnu kernel.

This reverts commit eb4c98ca3d (
[DebugInfo] Exclude memory location values as parameter entry values),
effectively reintroducing the portion of D60716 which dealt with memory
locations (authored by Djordje, Nikola, Ananth, and Ivan).

This partially addresses llvm.org/PR43343. However, not all memory
operands forwarded to callees live in spill slots. In the xnu build, it
may be possible to use an escape analysis to increase the number of call
site parameters by another 15% (more details in PR43343).

Differential Revision: https://reviews.llvm.org/D70254
2019-11-14 12:48:51 -08:00
Sergey Dmitriev 4d02263af0 [yaml2obj][COFF] Add support for extended relocation tables
Summary:
The tool does not correctly handle COFF sections with extended relocation tables (with IMAGE_SCN_LNK_NRELOC_OVFL bit set), this patch fixes this problem.

But I have cheated a bit in the test (to make it smaller) because extended relocation table is supposed to be used when the number of relocations exceeds 65534. Otherwise the test size would be pretty big.
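
As background, a reader of such a section can recover the relocation count
roughly as sketched below. This follows my understanding of the PE/COFF
format, where the overflow flag together with a 16-bit count of 0xFFFF means
the real count is stored in the VirtualAddress field of the first relocation.
The function and parameter names are hypothetical:

```python
IMAGE_SCN_LNK_NRELOC_OVFL = 0x01000000  # section characteristics flag


def relocation_count(number_of_relocations, characteristics, first_reloc_virtual_address):
    """Return the relocation count for a COFF section header.

    `first_reloc_virtual_address` is the VirtualAddress field of the first
    relocation entry; it is only consulted in the overflow case.
    """
    overflow = (characteristics & IMAGE_SCN_LNK_NRELOC_OVFL) != 0
    if overflow and number_of_relocations == 0xFFFF:
        return first_reloc_virtual_address
    return number_of_relocations
```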

Reviewers: jhenderson, MaskRay, mstorsjo

Reviewed By: mstorsjo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70251
2019-11-14 12:39:28 -08:00
Daniel Sanders b2839c442e [globalisel][irtanslator] The IRTranslator should preserve TBAA information 2019-11-14 12:11:27 -08:00
Sumanth Gundapaneni 7c7e368a7f [Pipeliner] Fix an assertion caused by iterator invalidation. 2019-11-14 13:08:06 -06:00
Sumanth Gundapaneni fdf1ae37cf [Hexagon] Validate the iterators before converting them to mux.
The conditional instructions that are translated to mux instructions
are deleted and the iterators to these deleted instructions are being
used later. This patch fixes this issue.
2019-11-14 13:01:16 -06:00
Sam Elliott 32d840d291 [RISCV] Use addi rather than add x0
Summary:
The RISC-V backend used to generate `add <reg>, x0, <reg>` in a few
instances. It seems most places no longer generate this sequence.

This is semantically equivalent to `addi <reg>, <reg>, 0`, but the
latter has the advantage of being noted to be the canonical instruction
to be used for moves (which microarchitectures can and should recognise
as such).

The changed testcases use instruction aliases - `mv <reg>, <reg>` is an
alias for `addi <reg>, <reg>, 0`.
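
The equivalence can be checked with a tiny register-file model. It assumes
32-bit registers and a hard-wired zero register `x0`; the helper names are
invented for this sketch:

```python
MASK32 = 0xFFFFFFFF


def write_reg(regs, rd, value):
    """Return a new register file with rd updated; writes to x0 are discarded."""
    new_regs = dict(regs)
    if rd != "x0":
        new_regs[rd] = value & MASK32
    return new_regs


def add(regs, rd, rs1, rs2):
    return write_reg(regs, rd, regs[rs1] + regs[rs2])


def addi(regs, rd, rs1, imm):
    return write_reg(regs, rd, regs[rs1] + imm)
```

Both `add a0, x0, a1` and `addi a0, a1, 0` leave the same register state in
this model, so the change affects only which encoding is emitted.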

Reviewers: luismarques

Reviewed By: luismarques

Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70124
2019-11-14 18:43:38 +00:00
Matthew Voss 141bb5f308 Add support for multi-module bitcode files to llvm-dis
Summary:
This change allows llvm-dis to disassemble multi-module bitcode
files, including the associated module summary.

Reviewers: tejohnson, pcc, mehdi_amini

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70153
2019-11-14 10:40:41 -08:00
Sergey Dmitriev caa9493da8 [llvm-objcopy][COFF] Add support for extended relocation tables
Summary: This patch adds support for COFF objects with extended relocation tables to the llvm-objcopy tool.

Reviewers: jhenderson, MaskRay, mstorsjo, alexshap, rupprecht

Reviewed By: mstorsjo

Subscribers: jakehehrlich, abrachet, seiya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70205
2019-11-14 10:31:50 -08:00
Luís Marques c6b09bff56 [RISCV] Fix wrong CFI directives
Summary: Removes CFI CFA directives that could incorrectly propagate
beyond the basic block they were intended for. Specifically it removes
the epilogue CFI directives. See the branch_and_tail_call test for an
example of the issue. Should fix the stack unwinding issues caused by
the incorrect directives.

Reviewers: asb, lenary, shiva0217
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69723
2019-11-14 18:29:50 +00:00
Sanjay Patel ce371ec6d7 [InstCombine] regenerate test CHECKs; NFC
There's a discussion about changing a shufflevector
transform in:
https://bugs.llvm.org/show_bug.cgi?id=43958

It would protect against our current undef/poison
behavior, and these are all tests that could be affected.
2019-11-14 10:23:16 -05:00
Tim Northover 232cdb3d30 ARM: allow rewriting frame indexes for all prefetch variants.
For some reason we could handle PLD but not PLDW or PLI, but all of them can
potentially refer to the stack region (if weirdly for PLI).
2019-11-14 14:26:28 +00:00
Kerry McLaughlin f9dd03b135 [AArch64][SVE] Implement floating-point comparison & reduction intrinsics
Summary:
Adds intrinsics for the following:
 - fadda & faddv
 - fminv, fmaxv, fminnmv & fmaxnmv
 - facge & facgt
 - fcmp[eq|ge|gt|ne|uo]

Reviewers: sdesmalen, huntergr, dancgr, mgudim

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69858
2019-11-14 13:47:08 +00:00
Sjoerd Meijer cb47b87830 [LV] PreferPredicateOverEpilog respecting predicate loop hint
The vectoriser queries TTI->preferPredicateOverEpilogue to determine if
tail-folding is preferred for a loop, but it was not respecting loop hint
'predicate' that can disable this, which has now been added. This showed that
we were incorrectly initialising loop hint 'vectorize.predicate.enable' with 0
(i.e. FK_Disabled) but this should have been FK_Undefined, which has been
fixed.

Differential Revision: https://reviews.llvm.org/D70125
2019-11-14 13:10:44 +00:00
Kerry McLaughlin cd83d9ff5c [AArch64][SVE] Implement remaining floating-point arithmetic intrinsics
Summary:
Adds intrinsics for the following:
  - fabs & fneg
  - fexpa
  - frint[a|i|m|n|p|x|z]
  - frecpe, frecps & frecpx
  - fsqrt, frsqrte & frsqrts

Reviewers: huntergr, sdesmalen, dancgr, mgudim

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69800
2019-11-14 11:59:00 +00:00
Kerry McLaughlin f7848fd8f7 [AArch64][SVE] Implement additional floating-point arithmetic intrinsics
Summary:
Adds intrinsics for the following:
  - ftssel
  - fcadd, fcmla
  - fmla, fmls, fnmla, fnmls
  - fmad, fmsb, fnmad, fnmsb

Reviewers: sdesmalen, huntergr, dancgr, mgudim

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69707
2019-11-14 11:35:50 +00:00
QingShan Zhang bcb6829ee6 [NFC] Add one test for PowerPC to verify the sext_inreg for vector type. 2019-11-14 10:57:05 +00:00
Daniil Suchkov 4c9d0da838 Revert "[InstCombine] Fold PHIs with equal incoming pointers"
This reverts commit a2f6ae9abf.
It is reverted due to clang-cmake-armv7-selfhost buildbot failure.
2019-11-14 17:42:01 +07:00
Daniil Suchkov a2f6ae9abf [InstCombine] Fold PHIs with equal incoming pointers
This is a resubmission of bbb29738b5 that
was reverted due to clang tests failures. It includes the fix and
additional IR tests for the missed case.

Summary:
In case when all incoming values of a PHI are equal pointers, this
transformation inserts a definition of such a pointer right after
definition of the base pointer and replaces with this value both PHI and
all its incoming pointers. Primary goal of this transformation is
canonicalization of this pattern in order to enable optimizations that
can't handle PHIs. Non-inbounds pointers aren't currently supported.

Reviewers: spatel, RKSimon, lebedev.ri, apilipenko

Reviewed By: apilipenko

Tags: #llvm

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D68128
2019-11-14 17:04:32 +07:00
Djordje Todorovic 2eb0862ed8 [AArch64][DebugInfo] Fix incorrect call site param value produced by MOVZXi
This resolves the problem with the truncation of the immediate operand.

Differential Revision: https://reviews.llvm.org/D70168
2019-11-14 11:02:35 +01:00
Pavel Labath eafe0cf5fa DWARFDebugLoclists: stricter base address handling
Summary:
This removes the use of zero as a base address in section-based dumping.
Although this will often be true for (unlinked) object files with a
single compile unit, it is not true in general. This means that
section-based dumping will not be able to resolve entries referencing
the base address (DW_LLE_offset_pair) -- it wasn't able to do that
correctly before either, but now it will be more explicit about it. One
exception to that is if the location list contains an explicit
DW_LLE_base_address entry -- in this case the dumper will pick it up,
and resolve subsequent entries normally.
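
A simplified sketch of the stricter handling, assuming location list entries
are given as (kind, a, b) tuples. The tuple layout and the function name are
illustrative only; the entry kinds are the DWARF v5 names used above:

```python
def resolve_location_list(entries, base=None):
    """Resolve DW_LLE_offset_pair entries against the current base address.

    Returns a list of (low, high) ranges, with None for entries that cannot
    be resolved because no base address is known. No fallback to zero.
    """
    resolved = []
    for kind, a, b in entries:
        if kind == "DW_LLE_base_address":
            base = a  # explicit base address entry: pick it up
        elif kind == "DW_LLE_offset_pair":
            if base is None:
                resolved.append(None)
            else:
                resolved.append((base + a, base + b))
        elif kind == "DW_LLE_end_of_list":
            break
    return resolved
```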

The patch also removes the fallback to zero in the "inline" dumping in
case the compile unit does not contain a base address.

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70115
2019-11-14 10:01:48 +01:00
Dimitry Andric 3db6783d8a Check result of emitStrLen before passing it to CreateGEP
Summary:
This fixes PR43081, where the transformation of `strchr(p, 0) -> p +
strlen(p)` can cause a segfault, if `-fno-builtin-strlen` is used.  In
that case, `emitStrLen` returns nullptr, which CreateGEP is not designed
to handle.  Also add the minimized code from the PR as a test case.
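
The shape of the fix can be modelled like this. Both functions and the tuple
"IR" are stand-ins invented for the example; only the check for a missing
result corresponds to what the summary describes:

```python
def emit_strlen(ptr, available_builtins):
    """Stand-in for emitStrLen: returns None when strlen may not be emitted."""
    if "strlen" not in available_builtins:
        return None
    return ("call", "strlen", ptr)


def fold_strchr_nul(ptr, available_builtins):
    """Fold strchr(p, 0) into p + strlen(p), or give up if strlen is unavailable."""
    length = emit_strlen(ptr, available_builtins)
    if length is None:
        return None  # keep the original strchr call
    return ("gep", ptr, length)
```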

Reviewers: xbolva00, spatel, jdoerfert, efriedma

Reviewed By: efriedma

Subscribers: lebedev.ri, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70143
2019-11-14 08:04:36 +01:00
Stanislav Mekhanoshin 4fa44f989e [AMDGPU] Fixed dpp test. NFC. 2019-11-13 16:38:54 -08:00
Stanislav Mekhanoshin af7d4022c7 [AMDGPU] Fixed mfma-loop test. NFC. 2019-11-13 16:03:54 -08:00
Craig Topper f7e9d81a8e [X86] Don't set the operation action for i16 SINT_TO_FP to Promote just because SSE1 is enabled.
Instead do custom promotion in the handler so that we can still
allow i16 to be used with fp80. And f64 without sse2.
2019-11-13 14:07:56 -08:00
Sanjay Patel be08af8816 [SimplifyCFG] add test for select with FMF; NFC 2019-11-13 16:45:42 -05:00
Sanjay Patel a3e61946c5 [SLP] fix miscompile on min/max reductions with extra uses (PR43948)
The bug manifests as replacing a reduction operand with an undef
value.

The problem appears to be limited to cases where a min/max reduction
has extra uses of the compare operand to the select.

In the general case, we are tracking "ExternallyUsedValues" and
an "IgnoreList" of the reduction operations, but those may not apply
to the final compare+select in a min/max reduction.

For that, we use replaceAllUsesWith (RAUW) to ensure that the new
vectorized reduction values are transferred to all subsequent users.

Differential Revision: https://reviews.llvm.org/D70148
2019-11-13 15:57:35 -05:00
Dimitry Andric 597b77fb7f Add -disable-builtin option to opt
Summary:
The option allows disabling specific target library builtin functions,
instead of -disable-simplify-libcalls, which disables all of them.

This is a prerequisite for D70143, which fixes PR43081.

Reviewers: xbolva00, spatel, jdoerfert, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70193
2019-11-13 21:32:49 +01:00
Francis Visoiu Mistrih 3dfe4cf982 [dsymutil] Add -dump to llvm-bcanalyzer invocations 2019-11-13 12:27:26 -08:00
Simon Atanasyan 14d3162285 [mips] Add test to check ELF output for JAL XGOT expansion. NFC 2019-11-13 22:57:55 +03:00
Simon Atanasyan 3216d28449 [mips] Add tests to check `jal sym+offset`. NFC 2019-11-13 22:57:54 +03:00
Quentin Colombet de94cda81b [LiveInterval] Allow updating subranges with slightly out-dated IR
During register coalescing, we update the live-intervals on-the-fly.
To do that we are in this strange mode where the live-intervals can
be slightly out-of-sync (more precisely they are forward looking)
compared to what the IR actually represents.
This happens because the register coalescer only updates the IR when
it is done with updating the live-intervals, and it has to do it this
way because updating the IR on-the-fly would actually clobber some
information on what the live-ranges that are being updated look like.

This is problematic for updates that rely on the IR to accurately
represent the state of the live-ranges. Right now, we have only
one of those: stripValuesNotDefiningMask.
To reconcile this need for out-of-sync IR, this patch introduces a
new argument to LiveInterval::refineSubRanges that allows the code
doing the live range updates to reason about what the code should
look like after the coalescer has rewritten the registers.
Essentially this captures how a subregister index will be offset
to match its position in a new register class.

E.g., let's say we want to merge:
    V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>

We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32>
overlap, i.e., by choosing a class where we can find "offset + 1 == 3".
Put differently we align V2's sub3 with V1's sub1:
    V2: sub0 sub1 sub2 sub3
    V1: <offset>  sub0 sub1
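
The arithmetic behind "offset + 1 == 3" can be written out as a small sketch.
Subregister indices are modelled as plain integers here, which is a
simplification, and the function names are hypothetical:

```python
def subreg_offset(dst_index, src_index):
    """Offset that aligns the destination subreg with the source subreg."""
    return src_index - dst_index


def rewrite_index(index, offset):
    """Position of a subreg index once its register is moved into the wider class."""
    return index + offset
```

For the COPY above, `subreg_offset(1, 3)` is 2, so V1's sub0 and sub1 land on
positions 2 and 3 of the wider register.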

This offset will look like a composed subregidx in the class:
     V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
 =>  V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>

Now if we didn't rewrite the uses and def of V1, all the checks for V1
need to account for this offset to match what the live intervals intend
to capture.

Prior to this patch, we would fail to recognize the uses and def of V1
and would end up with machine verifier errors: No live segment at def.
This could lead to miscompile as we would drop some live-ranges and
thus, miss some interferences.

For this problem to trigger, we need to reach stripValuesNotDefiningMask
while having a mismatch between the IR and the live-ranges (i.e.,
we have to apply a subreg offset to the IR.)

This requires the following three conditions:
1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1>
2. An update with Tuple registers with a possibility to coalesce the
   subreg index: e.g., v1.dsub_1 == v2.dsub_3
3. Subreg liveness enabled.

looking at the IR to decide what is alive and what is not, i.e., calling
stripValuesNotDefiningMask.
coalescer maintains for the live-ranges information.

None of the targets that currently use subreg liveness (i.e., the targets
that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1
and #2, so this patch also artificially enables subreg liveness for ARM,
so that a nice test case can be attached.
2019-11-13 11:17:56 -08:00
Michael Liao 2bf9b9a5a3 [TTI] Fix cast cost on vector types.
- Only split vector types when both src and dst types are splittable.
2019-11-13 13:54:07 -05:00
Francis Visoiu Mistrih 1ca85b3d33 [llvm-bcanalyzer] Don't dump the contents if -dump is not passed
With all the previous refactorings this slipped through and now we
always dump the contents of the bitcode files, even if -dump is not
passed.
2019-11-13 10:38:57 -08:00
Ahmed Bougacha 7313d7d618 [AArch64][v8.3a] Add missing imp-defs on RETA*.
RETA always implicitly uses LR, unlike RET which merely has an
alias that defaults it to LR.
Additionally, RETA implicitly uses SP as well, which it uses as
a discriminator to authenticate LR.

This isn't usually noticeable, because RET_ReallyLR is used in most
of the backend.  However, the post-RA scheduler, if enabled, will
cause miscompiles if the imp-uses are missing.

While there, fix a typo in the lone affected testcase.
2019-11-13 10:38:11 -08:00
Ahmed Bougacha 643ac6c042 [AArch64][v8.3a] Add LDRA '[xN]!' alias.
The instruction definition has been retroactively expanded to
allow for an alias for '[xN, 0]!' as '[xN]!'.
That wouldn't make sense on LDR, but does for LDRA.
2019-11-13 10:38:11 -08:00
Sanjay Patel 142cbe73e9 [SLP] improve test readability; NFC 2019-11-13 12:59:00 -05:00
Sanjay Patel 3d6b53980c [InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select
As noted by the FIXME comment, this is not correct based on our current FMF semantics.
We should be propagating FMF from the final value in a sequence (in this case the
'select'). So the behavior even without this patch is wrong, but we did not allow FMF
on 'select' until recently.

But if we do the correct thing right now in this patch, we'll inevitably introduce
regressions because we have not wired up FMF propagation for 'phi' and 'select' in
other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a
better incremental way to make progress.

That said, the potential extra damage over the existing wrong behavior from this
patch is very limited. AFAIK, the only way to have different FMF on IR in the same
function is if we have LTO inlined IR from 2 modules that were compiled using
different fast-math settings.

As seen in the tests, we may actually see some improvements with this patch because
adding the FMF to the 'select' allows matching to min/max intrinsics that were
previously missed (in the common case, the 'fcmp' and 'select' should have identical
FMF to begin with).

Next steps in the transition:

    Make similar changes in instcombine as needed.
    Enable phi-to-select FMF propagation in SimplifyCFG.
    Remove dependencies on fcmp with FMF.
    Deprecate FMF on fcmp.

Differential Revision: https://reviews.llvm.org/D69720
2019-11-13 10:38:42 -05:00
Simon Pilgrim e84b7a5fe2 Remove commented out CHECK-NEXT to try and appease llvm-clang-x86_64-expensive-checks-win buildbot 2019-11-13 14:59:12 +00:00
Florian Hahn f7499011ca [InstCombine] Avoid moving ops that do restrict undef across shuffles.
I think we have to be a bit more careful when it comes to moving
ops across shuffles, if the op does restrict undef. For example, without
this patch, we would move 'and %v, <0, 0, -1, -1>' over a
'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first
2 lanes of the result are undef after the combine, but they really
should be 0, unless I am missing something.

For ops that do fold to undef on undef operands, the current behavior
should be fine. I've added a conservative check, OpDoesRestrictUndef; maybe
there's a better existing utility?
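
The problem can be reproduced in a small lane-wise model where undef is
represented by `None`. The helpers below are illustrative and do not
correspond to InstCombine code; the moved constant `<undef, -1, -1, undef>`
is my reading of how the mask would be remapped through the shuffle:

```python
UNDEF = None


def and_lane(x, mask):
    """Lane-wise 'and' where a zero mask forces 0 even for an undef input."""
    if mask == 0:
        return 0
    if x is UNDEF or mask is UNDEF:
        return UNDEF
    return x & mask


def vec_and(vec, masks):
    return [and_lane(x, m) for x, m in zip(vec, masks)]


def shuffle(vec, indices):
    """Single-input shuffle; an undef index yields an undef lane."""
    return [UNDEF if i is UNDEF else vec[i] for i in indices]


a = [10, 11, 12, 13]
indices = [UNDEF, UNDEF, 1, 2]

# Original order: shuffle first, then 'and' with <0, 0, -1, -1>.
before = vec_and(shuffle(a, indices), [0, 0, -1, -1])

# After moving the 'and' across the shuffle, using <undef, -1, -1, undef>.
after = shuffle(vec_and(a, [UNDEF, -1, -1, UNDEF]), indices)
```

In this model `before` has zeros in the first two lanes while `after` has
undef there, which is the mismatch described above.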

Reviewers: spatel, RKSimon, lebedev.ri

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D70093
2019-11-13 13:40:34 +00:00
Luís Marques c5b56caa32 Revert "[RISCV] Fix wrong CFI directives"
test/DebugInfo/RISCV/relax-debug-frame.ll wasn't properly updated.
2019-11-13 13:28:33 +00:00
Florian Hahn 70cc355f2f [InstCombine] Precommit shuffle tests for D70093. 2019-11-13 13:25:28 +00:00
Sjoerd Meijer d90804d26b [ARM][MVE] canTailPredicateLoop
This implements TTI hook 'preferPredicateOverEpilogue' for MVE.  This is a
first version and it operates on single block loops only. With this change, the
vectoriser will now determine if tail-folding scalar remainder loops is
possible/desired, which is the first step to generate MVE tail-predicated
vector loops.

This is disabled by default for now. I.e., this depends on option
-disable-mve-tail-predication, which is off by default.

I will follow up on this soon with a patch for the vectoriser to respect loop
hint 'vectorize.predicate.enable'. I.e., with this loop hint set to Disabled,
we don't want to tail-fold and we shouldn't query this TTI hook, which is
done in D70125.

Differential Revision: https://reviews.llvm.org/D69845
2019-11-13 13:24:33 +00:00
Luís Marques a5ce8bd715 [RISCV] Fix wrong CFI directives
Summary: Removes CFI CFA directives that could incorrectly propagate
beyond the basic block they were intended for. Specifically it removes
the epilogue CFI directives. See the branch_and_tail_call test for an
example of the issue. Should fix the stack unwinding issues caused by
the incorrect directives.

Reviewers: asb, lenary, shiva0217
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69723
2019-11-13 13:06:15 +00:00
Simon Tatham a12f588ebb [ARM,MVE] Add intrinsics for contiguous load/stores.
This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.

Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.

The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the `-enable-arm-maskedldst` option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.

In the Tablegen backend, I've had to add a handful of extra support
features:

* We need to be able to make clang::Address objects out of a
  pointer and an alignment (previously we only needed these when the
  user passed us an existing one).

* We can now specify vector types that aren't 128 bits wide (for use
  in those intermediate values in IR), the parametrized type system
  can make one starting from two existing vector types (using the lane
  count of one and the element type of the other).

* I've added support for code generation of pointer casts, and for
  specifying LLVM types as operands to IRBuilder operations (for zext
  and sext, though I think they'll come in useful again).

* Now not all IR construction operations need to be specified as
  Builder.CreateFoo; some don't involve a Builder at all, and one
  passes it as a parameter to a tiny static helper function in
  CGBuiltin.cpp.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Subscribers: kristof.beyls, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70088
2019-11-13 12:47:00 +00:00
Hans Wennborg 6ea4775900 Revert 57dd4b0 "[ValueTracking] Allow context-sensitive nullness check for non-pointers"
This caused miscompiles of Chromium (https://crbug.com/1023818). The reduced
repro is small enough to fit here:

  $ cat /tmp/a.c
  unsigned char f(unsigned char *p) {
    unsigned char result = 0;
    for (int shift = 0; shift < 1; ++shift)
      result |= p[0] << (shift * 8);
    return result;
  }
  $ bin/clang -O2 -S -o - /tmp/a.c | grep -A4 f:
  f:                                      # @f
          .cfi_startproc
  # %bb.0:                                # %entry
          xorl    %eax, %eax
          retq

That's nicely optimized, but I don't think it's the right result :-)
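
For reference, a direct transcription of the reduced function shows what the
result should be. This Python model assumes 8-bit `unsigned char` and mimics
the truncation on assignment:

```python
def f(p):
    """Model of the C repro: the loop body runs once, with shift == 0."""
    result = 0
    for shift in range(1):
        result |= p[0] << (shift * 8)
        result &= 0xFF  # assignment back to unsigned char truncates to 8 bits
    return result
```

The function should return `p[0]` unchanged, so code that returns a constant
0 regardless of the input is a miscompile.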

> Same as D60846 but with a fix for the problem encountered there which
> was a missing context adjustment in the handling of PHI nodes.
>
> The test that caused D60846 to be reverted was added in e15ab8f277.
>
> Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam
>
> Subscribers: hiraditya, bollu, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D69571

This reverts commit 57dd4b03e4.
2019-11-13 12:19:02 +01:00
Mirko Brkusanin fed17867cd [Mips] Add rematerialization support for ldi.fmt
Instruction ldi.fmt can be considered cheap enough to avoid spill and restore
of value that it produces since it's loaded from immediate.

Differential Revision: https://reviews.llvm.org/D69898
2019-11-13 11:33:52 +01:00
Simon Atanasyan 068db2ed4d [mips] Show an error if 64-bit target triple provided with 32-bit CPU
When a 64-bit triple is used emit an error if the CPU only supports
32-bit code.

Patch by Miloš Stojanović.

Differential Revision: https://reviews.llvm.org/D70018
2019-11-13 13:32:39 +03:00
Simon Atanasyan b3853d8526 [mips][test] Add Mips CPU tests. NFC
Add tests that check all available CPUs on Mips.

Patch by Miloš Stojanović.

Differential Revision: https://reviews.llvm.org/D70017
2019-11-13 13:32:39 +03:00
Daniil Suchkov cba4a27745 Temporarily revert "[InstCombine] Fold PHIs with equal incoming pointers"
Revert due to sanitizer-windows buildbot failure.

This reverts commit bbb29738b5.
2019-11-13 17:14:11 +07:00
David Stenberg 5e646ff530 [DebugInfo] Avoid creating entry values for clobbered registers
Summary:
Entry values are considered for parameters that have register-described
DBG_VALUEs in the entry block (along with other conditions).

If a parameter's value has been propagated from the caller to the
callee, then the parameter's DBG_VALUE in the entry block may be
described using a register defined by some instruction. Entry values
should not be emitted for such a parameter, but currently they can be.
One such case was seen in the attached test case, in which the second
parameter, which is described by a redefinition of the first parameter's
register, would incorrectly get an entry value using the first
parameter's register. This commit intends to solve such cases by keeping
track of register defines, and ignoring DBG_VALUEs in the entry block
that are described by such registers.

In a RelWithDebInfo build of clang-8, the average size of the set was
27, and in a RelWithDebInfo+ASan build it was 30.
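
The bookkeeping described above can be sketched roughly as follows. This is a heavily simplified model, not the code from the patch: the `Inst` struct, its fields, and `entryValueCandidates` are hypothetical names chosen for illustration, and the other conditions for emitting entry values are ignored.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of an entry-block instruction.
struct Inst {
  enum Kind { Def, DbgValue } kind;
  unsigned reg;      // register defined (Def) or describing the parameter (DbgValue)
  std::string param; // parameter name; only meaningful for DbgValue
};

// Walk the entry block in order, remembering every register that has been
// defined so far. A DBG_VALUE whose register was already defined by some
// instruction is ignored; the rest remain entry-value candidates.
std::vector<std::string>
entryValueCandidates(const std::vector<Inst> &entryBlock) {
  std::set<unsigned> definedRegs;
  std::vector<std::string> candidates;
  for (const Inst &I : entryBlock) {
    if (I.kind == Inst::Def)
      definedRegs.insert(I.reg);
    else if (!definedRegs.count(I.reg))
      candidates.push_back(I.param);
  }
  return candidates;
}
```

In this model, a block of the form "DBG_VALUE a in r0; define r0; DBG_VALUE b in r0" yields only `a` as a candidate, which mirrors the situation described for the attached test case.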

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: djtodoro, vsk

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D69889
2019-11-13 11:10:47 +01:00
Sander de Smalen 3367686b4d [AArch64] Extend storeRegToStackSlot to spill SVE registers.
This patch allows the register allocator to spill SVE registers to the stack.

Reviewers: ostannard, efriedma, rengolin, cameron.mcinally

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70082
2019-11-13 10:09:32 +00:00
Daniil Suchkov bbb29738b5 [InstCombine] Fold PHIs with equal incoming pointers
When all incoming values of a PHI are equal pointers, this
transformation inserts a definition of such a pointer right after the
definition of the base pointer and replaces both the PHI and all its
incoming pointers with this value. The primary goal of this transformation
is to canonicalize this pattern in order to enable optimizations that
can't handle PHIs. Non-inbounds pointers aren't currently supported.

Reviewers: spatel, RKSimon, lebedev.ri, apilipenko

Reviewed By: apilipenko

Tags: #llvm

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D68128
2019-11-13 17:00:34 +07:00
Sander de Smalen 9a1c243aa5 [AArch64][SVE] Allocate locals that are scalable vectors.
This patch adds a target interface to set the StackID for a given type,
which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a
'sve-vec' StackID, so it is allocated in the SVE area of the stack frame.

Reviewers: ostannard, efriedma, rengolin, cameron.mcinally

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70080
2019-11-13 09:45:24 +00:00
Simon Tatham 5b9e4daef0 [ARM,MVE] Use VMOV.{S8,S16} for sign-extended extractelement.
MVE includes instructions that extract an 8- or 16-bit lane from a
vector and sign-extend it into the output 32-bit GPR. `ARMInstrMVE.td`
already included isel patterns to select those instructions in
response to the `ARMISD::VGETLANEs` selection-DAG node type. But
`ARMISD::VGETLANEs` was never actually generated, because the code
that creates it was conditioned on NEON only.

It's an easy fix to enable the same code for integer MVE, and now IR
that sign-extends the result of an extractelement (whether explicitly
or as part of the function call ABI) will use `vmov.s8` instead of
`vmov.u8` followed by `sxtb`.
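
As a rough illustration of why the single instruction is equivalent to the two-instruction sequence, here is a small C++ model of the lane extraction. The function names are hypothetical stand-ins for the instruction semantics, not LLVM or ACLE APIs, and the 16-lane byte array is an assumed model of a vector register:

```cpp
#include <cassert>
#include <cstdint>

// Model of `vmov.u8`: extract an 8-bit lane and zero-extend it to 32 bits.
uint32_t extract_zext8(const uint8_t (&vec)[16], unsigned lane) {
  return vec[lane];
}

// Model of `sxtb`: sign-extend the low byte of a 32-bit register.
// The narrowing conversion wraps modulo 2^8 (guaranteed since C++20,
// and the behavior of mainstream compilers before that).
int32_t sxtb(uint32_t r) {
  return static_cast<int8_t>(r & 0xff);
}

// Model of `vmov.s8`: extract an 8-bit lane and sign-extend it directly.
int32_t extract_sext8(const uint8_t (&vec)[16], unsigned lane) {
  return static_cast<int8_t>(vec[lane]);
}

// Illustrative check: both forms agree for every possible byte value.
void check_lane_extract() {
  uint8_t vec[16] = {};
  for (unsigned v = 0; v < 256; ++v) {
    vec[3] = static_cast<uint8_t>(v);
    assert(sxtb(extract_zext8(vec, 3)) == extract_sext8(vec, 3));
  }
}
```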

Reviewers: SjoerdMeijer, dmgreen, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70132
2019-11-13 09:08:41 +00:00
joanlluch d384ad6b63 [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4)
Summary:
Replaces
```
unsigned getShiftAmountThreshold(EVT VT)
```
by

```
bool shouldAvoidTransformToShift(EVT VT, unsigned amount)
```
thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not.

Updates the MSP430 target with a custom implementation.
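
To illustrate the extra flexibility, here is a standalone sketch of what a per-amount predicate could look like. It is not the MSP430 implementation from this patch: the real hook takes an `EVT`, whereas this simplified version takes a bit width, and the cost policy shown is an assumption made up for the example.

```cpp
#include <cassert>

// Simplified stand-in for the new hook. Returns true when a shift by
// `amount` on a `bits`-wide value should be treated as expensive.
bool shouldAvoidTransformToShift(unsigned bits, unsigned amount) {
  // Hypothetical policy for a target without a barrel shifter:
  // very small shifts are cheap, and a shift by 8 on a 16-bit value is
  // cheap because it can be done with a byte swap; everything else is
  // considered expensive.
  if (amount <= 2)
    return false;
  if (bits == 16 && amount == 8)
    return false;
  return true;
}
```

A single threshold value cannot express a policy like this one, where amount 8 is cheap but amounts 3 through 7 are not.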

This continues D69116, D69120, D69326 and updates them, so all of them must be committed before this.

Existing tests apply, a few more have been added.

Reviewers: asl, spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70042
2019-11-13 09:23:08 +01:00
Craig Topper a4b7613a49 [X86] Remove setOperationAction for FP_TO_SINT v8i16.
This is no longer needed after widening legalization as we
custom legalize v8i8 ourselves.

Added entries to the cost model, but bumped the cost slightly
to account for the truncate shuffle that wasn't costed before.
2019-11-12 22:45:52 -08:00
Matt Arsenault 9d7bccab66 AMDGPU: Extend add x, (ext setcc) combine to sub
This is the same as the add case, but inverts the operation type.

This avoids regressions in a future patch.
2019-11-13 07:13:58 +05:30
Matt Arsenault 4b47213951 AMDGPU: Switch backend default max workgroup size to 1024
Previously this would default to 256, not the maximum supported size
of 1024. Using a maximum lower than the hardware maximum requires
language runtimes to enforce this limit for correctness, which no
language has correctly done. Switch the default to the conservatively
correct maximum, and force frontends to opt-in to the more optimal 256
default maximum.

I don't really understand why the changes in occupancy-levels.ll
increased the computed occupancy, which I expected to decrease. I'm
not sure if these tests should be forcing the old maximum.
2019-11-13 07:11:02 +05:30
Matt Arsenault 25c5da5a42 AMDGPU: Reduce reported maximum group size to 1024
While some targets allow encoding 2048, this was never tested or
supported.
2019-11-13 06:34:28 +05:30
Alina Sbirlea 793b42a454 [GlobalsAA] Reenable test. 2019-11-12 16:53:28 -08:00
Alina Sbirlea 92611da5bf Temporarily disable test. 2019-11-12 15:57:51 -08:00
Eric Christopher 7a3ad48d6d Temporarily Revert "Reapply [LVI] Normalize pointer behavior" as it's broken python 3.6.
Reverting to figure out if it's a problem in python or the compiler for now.

This reverts commit 885a05f48a.
2019-11-12 15:51:51 -08:00
Craig Topper 3e1aee2ba7 [X86] Don't consider v64i1 as a legal type unless v64i8 is also a legal type.
This avoids some nasty issues with argument passing and lowering of
arbitrary v64i8 shuffles.
2019-11-12 14:56:02 -08:00
Yonghong Song 166cdc0281 [BPF] generate BTF_KIND_VARs for all non-static globals
Enable generation of BTF_KIND_VARs for non-static
default-section globals, which was not allowed previously.
Modified the existing test case to accommodate the new change.

Also removed unused linkage enum members VAR_GLOBAL_TENTATIVE and
VAR_GLOBAL_EXTERNAL.

Differential Revision: https://reviews.llvm.org/D70145
2019-11-12 14:34:08 -08:00
Alina Sbirlea db69f1b229 [GlobalsAA] Restrict ModRef result if any internal method has its address taken.
Summary:
If any internal method has its address taken, conclude that nothing is known about the relation between any other internal method and a global.

Reviewers: nlopes, sanjoy.google

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69690
2019-11-12 14:24:56 -08:00