Commit Graph

24970 Commits

Author SHA1 Message Date
Michael Berg ca38254601 extend folding fsub/fadd to fneg for FMF
Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it the flag that most closely represents the translation

Reviewers: spatel, wristow, arsenm

Reviewed By: spatel

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D50195

llvm-svn: 339357
2018-08-09 17:00:03 +00:00
Bjorn Pettersson c8b782cec2 [MC] Remove PhysRegSize from MCRegisterClass
Summary:
The interface to get size and spill size of a register
was moved from MCRegisterInfo to TargetRegisterInfo over
a year ago. Afaik the old interface has bee around
to give out-of-tree targets a chance to adapt to the
new interface.

One problem with the old MCRegisterClass::PhysRegSize was that
it represented the size of a register as "size in bits" / 8.
So a register had to be a multiple of eight bits wide for the
size to be correct (and the byte size for the target needed to
be eight bits).

Reviewers: kparzysz, qcolombet

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47199

llvm-svn: 339350
2018-08-09 15:19:07 +00:00
Simon Pilgrim a9f95429d9 [TargetLowering] Add BuildSDIVPattern helper to BuildExactSDIV (NFCI).
As requested in D50392, pull the magic constant calculations out into a helper function.

llvm-svn: 339346
2018-08-09 13:56:04 +00:00
Sanjay Patel e47dc1a405 [DAGCombiner] loosen constraints for fsub+fadd fold
isNegatibleForFree() should not matter here (as the test diffs show)
because it's always a win to replace an fsub+fadd with fneg. The
problem in D50195 persists because either (1) we are doing these
folds in the wrong order or (2) we're missing another fold for fadd.

llvm-svn: 339299
2018-08-08 23:04:43 +00:00
Sanjay Patel e327266d45 [DAGCombiner] move fadd simplification ahead of other folds
I don't know if it's possible to expose this diff in a test,
but we should always try simplifications (no new nodes created)
before more complicated transforms for efficiency (similar to
what we do in IR).

llvm-svn: 339298
2018-08-08 22:46:30 +00:00
Ties Stuij 083fb1a25c revert '[CodeGen] emit inline asm clobber list warnings for reserved'
llvm-svn: 339274
2018-08-08 17:11:54 +00:00
Jonas Devlieghere caacedb03e [DebugInfo] Fine tune emitting flags as part of the producer
When using APPLE extensions, don't duplicate the compiler invocation's
flags both in AT_producer and AT_APPLE_flags.

Differential revision: https://reviews.llvm.org/D50453

llvm-svn: 339268
2018-08-08 16:33:22 +00:00
Simon Pilgrim 4d4220fa2a [DAG] DAGCombiner::visitSDIVLike - remove unnecessary isConstOrConstSplat call. NFCI.
The isConstOrConstSplat result is only used in a ISD::matchUnaryPredicate call which can perform the equivalent iteration just as quickly.

llvm-svn: 339262
2018-08-08 15:37:52 +00:00
Ties Stuij 52f3631f4b [CodeGen] emit inline asm clobber list warnings for reserved
Summary:
Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results.

For example:


```
extern int bar(int[]);

int foo(int i) {
  int a[i]; // VLA
  asm volatile(
      "mov r7, #1"
    :
    :
    : "r7"
  );

  return 1 + bar(a);
}
```

Compiled for thumb, this gives:
```
$ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb
...
foo:
        .fnstart
@ %bb.0:                                @ %entry
        .save   {r4, r5, r6, r7, lr}
        push    {r4, r5, r6, r7, lr}
        .setfp  r7, sp, #12
        add     r7, sp, #12
        .pad    #4
        sub     sp, #4
        movs    r1, #7
        add.w   r0, r1, r0, lsl #2
        bic     r0, r0, #7
        sub.w   r0, sp, r0
        mov     sp, r0
        @APP
        mov.w   r7, #1
        @NO_APP
        bl      bar
        adds    r0, #1
        sub.w   r4, r7, #12
        mov     sp, r4
        pop     {r4, r5, r6, r7, pc}
...
```

r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block.

This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward.  Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now.

The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter.

If we find a reserved register, we print a warning:
```
repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm]
      "mov r7, #1"
      ^
```

Reviewers: eli.friedman, olista01, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: efriedma, eraman, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D49727

llvm-svn: 339257
2018-08-08 15:15:59 +00:00
Simon Pilgrim 164e8b0b5c [TargetLowering] BuildUDIV - Add support for divide by one (PR38477)
Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike.

I investigated whether we could achieve this by magic MULHU/SRL values but nothing appeared to work as we don't have a way for MULHU(x,c) -> x

llvm-svn: 339254
2018-08-08 14:51:19 +00:00
Simon Pilgrim e4a4cf5a8b [TargetLowering] Remove APInt divisor argument from BuildExactSDIV (NFCI).
As requested in D50392, this is a minor refactor to BuildExactSDIV to stop taking the uniform constant APInt divisor and instead extract it locally.

I also cleanup the operands and valuetypes to better match BuildUDiv (and BuildSDIV in the near future).

llvm-svn: 339246
2018-08-08 13:59:44 +00:00
Ties Stuij 81f1fbdf5a test commit access
Summary: changing a few typos

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50445

llvm-svn: 339245
2018-08-08 13:51:13 +00:00
Simon Pilgrim a10cfcc1db [TargetLowering] BuildUDIV - Early out for divide by one (PR38477)
We're not handling the UDIV by one special case properly - for now just early out.

llvm-svn: 339229
2018-08-08 10:00:54 +00:00
Thomas Preud'homme 4107b31df2 Support inline asm with multiple 64bit output in 32bit GPR
Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR).

Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45437

llvm-svn: 339225
2018-08-08 09:35:26 +00:00
Craig Topper 49ed49fcb1 [SelectionDAG] When splitting scatter nodes during DAGCombine, create a serial chain dependency.
Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code.

Differential Revision: https://reviews.llvm.org/D50374

llvm-svn: 339157
2018-08-07 17:35:02 +00:00
Simon Pilgrim 1bfadb0499 [DAG] Allow non-uniform constant vectors to call BuildSDIV
This was missed in D50185.

NFC until we add actual non-uniform support to BuildSDIV (similar BuildUDIV support in D49248) - for now it just early outs.

llvm-svn: 339147
2018-08-07 14:50:39 +00:00
Simon Pilgrim 6943e39353 [TargetLowering] Use pre-computed Shift value type in BuildUDIV (NFCI)
This was missed in D49248

llvm-svn: 339146
2018-08-07 14:40:21 +00:00
Jonas Devlieghere 42243df3b9 Fix inconsistency with/without debug information (-g)
This fixes an inconsistency in code generation when compiling with or
without debug information (-g). When debug information is available in
an empty block, the original test would fail, resulting in possibly
different code.

Patch by: Jeroen Dobbelaere

Differential revision: https://reviews.llvm.org/D49467

llvm-svn: 339129
2018-08-07 12:14:01 +00:00
Pavel Labath 2f0881160c [DebugInfo] Reduce debug_str_offsets section size
Summary:
The accelerator tables use the debug_str section to store their strings.
However, they do not support the indirect method of access that is
available for the debug_info section (DW_FORM_strx et al.).

Currently our code is assuming that all strings can/will be referenced
indirectly, and puts all of them into the debug_str_offsets section.
This is generally true for regular (unsplit) dwarf, but in the DWO case,
most of the strings in the debug_str section will only be used from the
accelerator tables. Therefore the contents of the debug_str_offsets
section will be largely unused and bloating the main executable.

This patch rectifies this by teaching the DwarfStringPool to
differentiate between strings accessed directly and indirectly. When a
user inserts a string into the pool it has to declare whether that
string will be referenced directly or not. If at least one user requsts
indirect access, that string will be assigned an index ID and put into
debug_str_offsets table. Otherwise, the offset table is skipped.

This approach reduces the overall binary size (when compiled with
-gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is
shrunk by 99%).

Reviewers: probinson, dblaikie, JDevlieghere

Subscribers: aprantl, mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D49493

llvm-svn: 339122
2018-08-07 09:54:52 +00:00
Simon Pilgrim 7e18938793 [TargetLowering] Add support for non-uniform vectors to BuildUDIV
This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators.

It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV.

Differential Revision: https://reviews.llvm.org/D49248

llvm-svn: 339121
2018-08-07 09:51:34 +00:00
Craig Topper 9de1797c50 [SelectionDAG][X86] Rename MaskedLoadSDNode::getSrc0 to getPassThru.
Src0 doesn't really convey any meaning to what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from.

llvm-svn: 339101
2018-08-07 06:52:49 +00:00
Craig Topper 17989208a9 [SelectionDAG][X86] Rename getValue to getPassThru for gather SDNodes.
getValue is more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather.

llvm-svn: 339096
2018-08-07 06:13:40 +00:00
Reid Kleckner 5327805d7c Fix a -Wsign-compare
llvm-svn: 339059
2018-08-06 21:26:47 +00:00
Reid Kleckner 15e91c3235 [X86] Fix assertion in subreg extraction
This assert fires when attempting to extract a subregister from the
global PIC base register. This virtual register SD node is not in the
VRBaseMap, so we shouldn't call getVR to look it up there. If this is a
RegisterSDNode, we should be able to use the virtual register directly.

Fixes PR38385

llvm-svn: 339056
2018-08-06 21:16:16 +00:00
Wei Mi 3c1c088500 [RegisterCoalescer] Delay live interval update work until the rematerialization
for all the uses from the same def is done.

We run into a compile time problem with flex generated code combined with
`-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants
outside of a big loop, and drastically increases the compile time in global
register splitting and copy coalescing.  https://reviews.llvm.org/D49353
relieves the problem in global splitting. This patch is to handle the problem
in copy coalescing.

About the situation where the problem in copy coalescing happens. After
machineLICM, we have several defs outside of a big loop with hundreds or
thousands of uses inside the loop. Rematerialization in copy coalescing
happens for each use and everytime rematerialization is done, shrinkToUses
will be called to update the huge live interval. Because we have 'n' uses
for a def, and each live interval update will have at least 'n' complexity,
the total update work is n^2.

To fix the problem, we try to do the live interval update work in a collective
way. If a def has many copylike uses larger than a threshold, each time
rematerialization is done for one of those uses, we won't do the live interval
update in time but delay that work until rematerialization for all those uses
are completed, so we only have to do the live interval update work once.

Delaying the live interval update could potentially change the copy coalescing
result, so we hope to limit that change to those defs with many
(like above a hundred) copylike uses, and the cutoff can be adjusted by the
option -mllvm -late-remat-update-threshold=xxx.

Differential Revision: https://reviews.llvm.org/D49519

llvm-svn: 339035
2018-08-06 17:30:45 +00:00
Hsiangkai Wang ef72e481ea [DebugInfo] Refactor DbgInfoIntrinsic class hierarchy.
In the past, DbgInfoIntrinsic has a strong assumption that these
intrinsics all have variables and expressions attached to them.
However, it is too strong to derive the class for other debug entities.
Now, it has problems for debug labels.

In order to make DbgInfoIntrinsic as a base class for 'debug info', I
create a class for 'variable debug info', DbgVariableIntrinsic.

DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it.

Differential Revision: https://reviews.llvm.org/D50220

llvm-svn: 338984
2018-08-06 03:59:47 +00:00
Aditya Nandakumar e07b3b737b [GISel]: Add Opcodes for CTLZ/CTTZ/CTPOP
https://reviews.llvm.org/D48600

Added IRTranslator support to translate these known intrinsics into GISel opcodes.

llvm-svn: 338944
2018-08-04 01:22:12 +00:00
Craig Topper c4960582ec [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store.
The mask operand is visited before the data operand so we need to be able to widen it.

Fixes PR38436.

llvm-svn: 338915
2018-08-03 20:14:18 +00:00
Matt Arsenault c3dc8e65e2 DAG: Enhance isKnownNeverNaN
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.

Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.

llvm-svn: 338910
2018-08-03 18:27:52 +00:00
Simon Pilgrim 94112ebc75 [TargetLowering] Generalise BuildSDIV function
First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where its still used.

Differential Revision: https://reviews.llvm.org/D50185

llvm-svn: 338838
2018-08-03 10:00:54 +00:00
Eli Friedman 1ba5e9ac24 [GlobalMerge] Allow merging globals with explicit section markings.
At least on ELF, it's impossible to tell from the object file whether
two globals with the same section marking were merged: the merged global
uses "private" linkage to hide its symbol, and the aliases look like
regular symbols. I can't think of any other reason to disallow it.
(Of course, we can only merge globals in the same section.)

The weird alignment handling matches AsmPrinter; our alignment handling
for global variables should probably be refactored.

Differential Revision: https://reviews.llvm.org/D49822

llvm-svn: 338791
2018-08-02 23:54:16 +00:00
Matt Arsenault 1f3977a856 DAG: Fix vector widening fcanonicalize
llvm-svn: 338715
2018-08-02 13:43:53 +00:00
Alexander Ivchenko 49168f6778 [GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per Value
This is logical continuation of https://reviews.llvm.org/D46018 (r332449)

Differential Revision: https://reviews.llvm.org/D49660

llvm-svn: 338685
2018-08-02 08:33:31 +00:00
Lei Liu b9a7b7a84d Fix FCOPYSIGN expansion
In expansion of FCOPYSIGN, the shift node is missing when the two
operands of FCOPYSIGN are of the same size. We should always generate
shift node (if the required shift bit is not zero) to put the sign
bit into the right position, regardless of the size of underlying
types.

Differential Revision: https://reviews.llvm.org/D49973

llvm-svn: 338665
2018-08-02 01:54:12 +00:00
Lei Liu 8e422b8403 [AArch64] DWARF: do not generate AT_location for thread local
AArch64 ELF ABI does not define a static relocation type for TLS offset within
a module, which makes it impossible for compiler to generate a valid
DW_AT_location content for thread local variables. Currently LLVM generates an
invalid R_AARCH64_ABS64 relocation at the DW_AT_location field for a TLS
variable. That causes trouble for linker because thread local variable does
not have an absolute address at link time. AArch64 GCC solves the problem by
not generating DW_AT_location for thread local variables. We should do the
same in LLVM.

Differential Revision: https://reviews.llvm.org/D43860

llvm-svn: 338655
2018-08-01 23:46:49 +00:00
Alexey Bataev d4dd7215f6 [DEBUGINFO] Disable emission of the dwarf sections, but allow directives.
Summary:
Added an option that allows to emit only '.loc' and '.file' kind debug
directives, but disables emission of the DWARF sections. Required for
NVPTX target to support profiling. It requires '.loc' and '.file'
directives, but does not require any DWARF sections for the profiler.

Reviewers: probinson, echristo, dblaikie

Subscribers: aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46021

llvm-svn: 338616
2018-08-01 19:38:20 +00:00
Michael Berg d3ce4c3d94 [NFC] small addendum to r334242, FMF propagation
llvm-svn: 338604
2018-08-01 18:06:49 +00:00
Sanjay Patel 8aac22e06a [SelectionDAG] fix bug in translating funnel shift with non-power-of-2 type
The bug is visible in the constant-folded x86 tests. We can't use the
negated shift amount when the type is not power-of-2:
https://rise4fun.com/Alive/US1r

...so in that case, use the regular lowering that includes a select
to guard against a shift-by-bitwidth. This path is improved by only
calculating the modulo shift amount once now.

Also, improve the rotate (with power-of-2 size) lowering to use
a negate rather than subtract from bitwidth. This improves the
codegen whether we have a rotate instruction or not (although
we can still see that we're not matching to a legal rotate in
all cases).

llvm-svn: 338592
2018-08-01 17:17:08 +00:00
Simon Pilgrim a3548c960e [SelectionDAG] Make binop reduction matcher available to all targets
There is nothing x86-specific about this code, so it'd be nice to make this available for other targets to use in the future (and get it out of X86ISelLowering!).

Differential Revision: https://reviews.llvm.org/D50083

llvm-svn: 338586
2018-08-01 16:52:28 +00:00
Cameron McInally 04ae85859d [FPEnv] Widen illegal width StrictFP vector operations as needed
Differential Revision: https://reviews.llvm.org/D49806

llvm-svn: 338562
2018-08-01 14:17:19 +00:00
Jonas Devlieghere 8acb74e01f [MC] Report fatal error for DWARF types for non-ELF object files
Getting the DWARF types section is only implemented for ELF object
files. We already disabled emitting debug types in clang (r337717), but
now we also report an fatal error (rather than crashing) when trying to
obtain this section in MC. Additionally we ignore the generate debug
types flag for unsupported target triples.

See PR38190 for more information.

Differential revision: https://reviews.llvm.org/D50057

llvm-svn: 338527
2018-08-01 12:53:06 +00:00
Victor Leschuk 64e0c56717 [DWARF] Basic support for producing DWARFv5 .debug_addr section
This revision implements support for generating DWARFv5 .debug_addr section.
The implementation is pretty straight-forward: we just check the dwarf version
and emit section header if needed.

Reviewers: aprantl, dblaikie, probinson

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D50005

llvm-svn: 338487
2018-08-01 05:48:06 +00:00
Amara Emerson 6cdfe29d8e [GlobalISel][IRTranslator] Use RPO traversal when visiting blocks to translate.
Previously we were just visiting the blocks in the function in IR order, which
is rather arbitrary. Therefore we wouldn't always visit defs before uses, but
the translation code relies on this assumption in some places.

Only codegen change seen in tests is an elision of a redundant copy.

Fixes PR38396

llvm-svn: 338476
2018-08-01 02:17:42 +00:00
Eric Christopher 7a70be6865 Simplify selectELFSectionForGlobal by pulling out the entry size
determination for mergeable sections into a small static function.

llvm-svn: 338469
2018-08-01 01:29:30 +00:00
Eric Christopher ad36c74562 Tidy up logic around unique section name creation and remove a
mostly unused variable.

llvm-svn: 338468
2018-08-01 01:03:34 +00:00
Eli Friedman da08078fb2 [MachineOutliner] Clean up subtarget handling.
Call shouldOutlineFromFunctionByDefault, isFunctionSafeToOutlineFrom,
getOutliningType, and getMachineOutlinerMBBFlags using the correct
TargetInstrInfo. And don't create a MachineFunction for a function
declaration.

The call to getOutliningCandidateInfo is still a little weird, but at
least the weirdness is explicitly called out.

Differential Revision: https://reviews.llvm.org/D49880

llvm-svn: 338465
2018-08-01 00:37:20 +00:00
Wolfgang Pieb baf94f830b [DWARF] Do not create a .debug_ranges section when no ranges are needed.
Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D50089

llvm-svn: 338437
2018-07-31 20:56:32 +00:00
Matt Arsenault dcec0888e2 DAG: Correct pointer type used for stack slot
Correct the address space for the inserted argument
stack slot.

AMDGPU seems to not do anything with this information,
so I don't think this was breaking anything.

llvm-svn: 338428
2018-07-31 19:51:20 +00:00
Vlad Tsyrklevich 48ed9acede Revert "[DebugInfo] Generate DWARF debug information for labels."
This reverts commits r338390 and r338398, they were causing LSan
failures on the ASan bot.

llvm-svn: 338408
2018-07-31 18:10:37 +00:00
Hsiangkai Wang cbc58ada99 [DebugInfo] Generate DWARF debug information for labels.
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 338390
2018-07-31 14:48:32 +00:00
Matt Arsenault a5ed032118 DAG: Fix PromoteFloatResult for fcanonicalize
llvm-svn: 338382
2018-07-31 14:15:22 +00:00
Hsiangkai Wang 615540d0f2 Test commit.
llvm-svn: 338352
2018-07-31 06:09:29 +00:00
Amara Emerson 6aff5a7810 [GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.
Differential Revision: https://reviews.llvm.org/D49900

llvm-svn: 338335
2018-07-31 00:08:50 +00:00
Craig Topper 2f60ef2c78 [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.

llvm-svn: 338329
2018-07-30 23:22:00 +00:00
Sanjay Patel 9f807f44b1 [DAGCombiner] transform sub-of-shifted-signbit to add
This is exchanging a sub-of-1 with add-of-minus-1:
https://rise4fun.com/Alive/plKAH

This is another step towards improving select-of-constants codegen (see D48970).

x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral.
I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but
I think canonicalizing to 'add' is more likely to produce further transforms because we have more
folds for 'add'.

Differential Revision: https://reviews.llvm.org/D49924

llvm-svn: 338317
2018-07-30 22:21:37 +00:00
Craig Topper 42d312bb83 [TargetLowering] In BuildSDIV, add the MULHS/SMUL_LOHI to the Created vector.
BuildUDIV was already correct.

llvm-svn: 338304
2018-07-30 21:04:38 +00:00
Craig Topper a568a27dfa [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2.
llvm-svn: 338303
2018-07-30 21:04:34 +00:00
Craig Topper b94d5f853b Revert r338222 "[DAGCombiner] Remove unnecessary calls to AddToWorklist."
Thinking about it more it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead.

So its probably safer to leave this alone.

llvm-svn: 338298
2018-07-30 20:27:10 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
David Bolvansky 2fa7fb14ea [DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a rotate can be formed
Summary:
Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed.  This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op.

Patch by: sameconrad (Sam Conrad)

Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar

Reviewed By: lebedev.ri

Subscribers: efriedma, kparzysz, llvm-commits

Differential Revision: https://reviews.llvm.org/D47681

llvm-svn: 338270
2018-07-30 16:50:00 +00:00
Thomas Preud'homme 196149c943 Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"
This reapplies commit r338206 reverted by r338214 since the bug that
r338206 uncovered has been fixed in r338268.

Add support for inline assembly with matching input operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR). Note that regular input is already handled by existing
code.

llvm-svn: 338269
2018-07-30 16:48:39 +00:00
Karl-Johan Karlsson e5899447b4 [RegisterScavenger] Fix debug print
llvm-svn: 338231
2018-07-30 08:17:00 +00:00
Craig Topper e978d2ee4a [DAGCombiner] Remove unnecessary calls to AddToWorklist.
The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found Through this operand check. These means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes.

I've removed the most obvious cases here. There are probably more than can be removed.

llvm-svn: 338222
2018-07-29 18:39:26 +00:00
Sanjay Patel 7312206f2f revert r338206 because the test does not pass
Example of bot failure:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll

llvm-svn: 338214
2018-07-29 14:30:49 +00:00
Thomas Preud'homme 74ffd14e15 Fix crash on inline asm with 64bit matching input in 32bit GPR
Add support for inline assembly with matching input operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR). Note that regular input is already handled by existing
code.

llvm-svn: 338206
2018-07-28 21:33:39 +00:00
Craig Topper 9db3573d3a [SelectionDAG] Pass std::vector by reference instead of by pointer to BuildSDIV/BuildUDIV.
This removes the need for an assert to ensure the pointer isn't null.

Years ago we had ifs the checked the pointer was non-null before very access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here.

llvm-svn: 338205
2018-07-28 19:44:20 +00:00
Matt Arsenault 81920b0a25 DAG: Add calling convention argument to calling convention funcs
This seems like a pretty glaring omission, and AMDGPU
wants to treat kernels differently from other calling
conventions.

llvm-svn: 338194
2018-07-28 13:25:19 +00:00
Craig Topper 50b1d4303d [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)
This can be useful since addition is commutable, and subtraction is not.

This matches a transform that is also done by InstCombine.

llvm-svn: 338181
2018-07-28 00:27:25 +00:00
Jessica Paquette 9d93c6026a [MachineOutliner] Exit getOutliningCandidateInfo when we erase all candidates
There was a missing check for if a candidate list was entirely deleted. This
adds that check.

This fixes an asan failure caused by running test/CodeGen/AArch64/addsub_ext.ll
with the MachineOutliner enabled.

llvm-svn: 338148
2018-07-27 18:21:57 +00:00
Sanjay Patel c7abb416dc [DAGCombiner] fold 'not' with signbit math
This is a follow-up suggested in D48970. 

Alive proofs:
https://rise4fun.com/Alive/sII

We can eliminate an instruction in the usual select-of-constants 
to bit hack transform by adjusting the add/sub with constant.
This is always a win. 

There are more transforms that are likely wins, but they may need 
target hooks in case some targets do not benefit. 

This is another step towards making up for canonicalizing to 
select-of-constants in rL331486.

llvm-svn: 338132
2018-07-27 16:42:55 +00:00
Matt Arsenault 611dff423c DAG: Remove unnecessary .str()
llvm-svn: 338112
2018-07-27 09:04:41 +00:00
Craig Topper 1a40a06549 [SelectionDAGBuilder] Add masked loads to PendingLoads rather than calling DAG.setRoot.
Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load.

This patch instead adds the masked load to PendingLoads so that the root doesn't get update until a store or scatter or something happens.. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization.

llvm-svn: 338085
2018-07-26 23:22:11 +00:00
Wolfgang Pieb 9ea65082ff [DWARF v5] Reposting r337981, which was reverted in r337997 due to a test failure in debuginfo_tests.
The test failure was caused by the compiler not emitting a __debug_ranges section with DWARF 4 and
earlier when no ranges are needed. The test checks for the existence regardless.

llvm-svn: 338081
2018-07-26 22:48:52 +00:00
Craig Topper 8da280f50b [SelectionDAG] Add MLOAD/MSTORE/MGATHER/MSCATTER to AddNodeIDCustom to properly calculate their folding set ID to allow them to be CSEd.
llvm-svn: 338080
2018-07-26 22:40:24 +00:00
Craig Topper 8b5a2f7aac [DAGCombiner] Remove some calls to AddToWorklist that should be unnecessary.
The DAGCombiner has a system for ensuring all nodes are visited. It doesn't require an AddToWorkList for every node that is created by a combine.

llvm-svn: 338079
2018-07-26 22:40:22 +00:00
Tim Renouf c0a44791e7 [RegisterCoalescer] Fixed inconsistent followCopyChain with subreg
Summary:
The behavior of followCopyChain with a subreg depends on the order in
which subranges appear in a live interval, which is bad.

This commit fixes that, and allows the copy chain to continue only if
all matching subranges that are not undefined take us to the same def.

I don't have a test for this; the reproducer I had on my branch with
various other local changes does not reproduce the problem on upstream
llvm. Also that reproducer was an ll test; attempting to convert it to a
mir test made the subranges appear in a different order and hid the
problem.

However I would argue that the old behavior was obviously wrong
and needs fixing.

Subscribers: MatzeB, qcolombet, llvm-commits

Differential Revision: https://reviews.llvm.org/D49535

Change-Id: Iee7936ef305918f3b498ac432e2cf651ae5cc2df
llvm-svn: 338070
2018-07-26 21:27:34 +00:00
Vedant Kumar b572f64212 [DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst users
LowerDbgDeclare inserts a dbg.value before each use of an address
described by a dbg.declare. When inserting a dbg.value before a CallInst
use, however, it fails to append DW_OP_deref to the DIExpression.

The DW_OP_deref is needed to reflect the fact that a dbg.value describes
a source variable directly (as opposed to a dbg.declare, which relies on
pointer indirection).

This patch adds in the DW_OP_deref where needed. This results in the
correct values being shown during a debug session for a program compiled
with ASan and optimizations (see https://reviews.llvm.org/D49520). Note
that ConvertDebugDeclareToDebugValue is already correct -- no changes
there were needed.

One complication is that SelectionDAG is unable to distinguish between
direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also
fixes this long-standing issue in order to not regress integration tests
relying on the incorrect assumption that all frame-index SDDbgValues are
indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot
be lowered properly otherwise. Basically the fix prevents a direct
SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice
by a debugger. There were a handful of tests relying on this incorrect
"FRAMEIX => indirect" assumption which actually had incorrect
DW_AT_locations: these are all fixed up in this patch.

Testing:

- check-llvm, and an end-to-end test using lldb to debug an optimized
  program.
- Existing unit tests for DIExpression::appendToStack fully cover the
  new DIExpression::append utility.
- check-debuginfo (the debug info integration tests)

Differential Revision: https://reviews.llvm.org/D49454

llvm-svn: 338069
2018-07-26 20:56:53 +00:00
Matthias Braun 09810c9269 MacroFusion: Fix macro fusion with ExitSU failing in top-down scheduling
When fusing instructions A and B, we must add all predecessors of B as
predecessors of A to avoid instructions getting scheduling in between.

There is a special case involving ExitSU: Every other node must be
scheduled before it by design and we don't need to make this explicit in
the graph, however when fusing with a different node we need to schedule
every othere node before the fused node too and we need to make this
explicit now: This patch adds a dependency from the fused node to all
roots in the graph.

Differential Revision: https://reviews.llvm.org/D49830

llvm-svn: 338046
2018-07-26 17:43:56 +00:00
Roman Lebedev 41ba5c1455 [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle ule,ugt CondCodes.
Summary:
A follow-up for D49266 / rL337166.

At least one of these cases is more canonical,
so we really do have to handle it.
https://godbolt.org/g/pkzP3X
https://rise4fun.com/Alive/pQyhZZ

We won't get to these cases with I1 being -1,
as that will be constant-folded to true or false.

I'm also not sure we actually hit the 'ule' case,
but i think the worst think that could happen is that being dead code.

Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49497

llvm-svn: 338044
2018-07-26 17:34:28 +00:00
Alexey Bataev 4dd7558fab [DEBUGINFO, NVPTX] Emit correct debug information for local variables.
Summary:
NVPTX target dos not use register-based frame information. Instead it
relies on the artificial local_depot that is used instead of the frame
and the data for variables must be emitted relatively to this
local_depot.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D45963

llvm-svn: 338039
2018-07-26 16:29:52 +00:00
Alexey Bataev 7ae86fe71c [DEBUGINFO, NVPTX] Set `DW_AT_frame_base` to `DW_OP_call_frame_cfa`.
Summary:
For NVPTX target the value of `DW_AT_frame_base` attribute must be set
to `DW_OP_call_frame_cfa`.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D45785

llvm-svn: 338036
2018-07-26 16:10:05 +00:00
Pavel Labath 7bfa5d6544 dwarfgen: Add support for generating the debug_str_offsets section, take 3
Previous version of this patch failed on darwin targets because of
different handling of cross-debug-section relocations. This fixes the
tests to emit the DW_AT_str_offsets_base attribute correctly in both
cases. Since doing this is a non-trivial amount of code, and I'm going
to need it in more than one test, I've added a helper function to the
dwarfgen DIE class to do it.

Original commit message follows:

The motivation for this is D49493, where we'd like to test details of
debug_str_offsets behavior which is difficult to trigger from a
traditional test.

This adds the plubming necessary for dwarfgen to generate this section.
The more interesting changes are:
- I've moved emitStringOffsetsTableHeader function from DwarfFile to
  DwarfStringPool, so I can generate the section header more easily from
  the unit test.
- added a new addAttribute overload taking an MCExpr*. This is used to
  generate the DW_AT_str_offsets_base, which links a compile unit to the
  offset table.

I've also added a basic test for reading and writing DW_form_strx forms.

Reviewers: dblaikie, JDevlieghere, probinson

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D49670

llvm-svn: 338031
2018-07-26 14:36:07 +00:00
Martin Storsjo 9dafd6f6d9 Revert "[COFF] Use comdat shared constants for MinGW as well"
This reverts commit r337951.

While that kind of shared constant generally works fine in a MinGW
setting, it broke some cases of inline assembly that worked before:

$ cat const-asm.c
int MULH(int a, int b) {
    int rt, dummy;
    __asm__ (
        "imull %3"
        :"=d"(rt), "=a"(dummy)
        :"a"(a), "rm"(b)
    );
    return rt;
}
int func(int a) {
    return MULH(a, 1);
}
$ clang -target x86_64-win32-gnu -c const-asm.c -O2
const-asm.c:4:9: error: invalid variant '00000001'
        "imull %3"
        ^
<inline asm>:1:15: note: instantiated into assembly here
        imull __real@00000001(%rip)
                     ^

A similar error is produced for i686 as well. The same test with a
target of x86_64-win32-msvc or i686-win32-msvc works fine.

llvm-svn: 338018
2018-07-26 10:48:20 +00:00
Alex Lorenz 7d808c19ff Revert r337981: it breaks the debuginfo-tests
This commit caused a regression in the debuginfo-tests:

FAIL: debuginfo-tests :: apple-accel.cpp (40748 of 46595)
llvm-svn: 337997
2018-07-26 03:21:40 +00:00
Amara Emerson fdd089aa14 [GlobalISel] Fall back to SDISel for swifterror/swiftself attributes.
We don't currently support these, fall back until we do.

llvm-svn: 337994
2018-07-26 01:25:58 +00:00
Matthias Braun 5c1e23b2e3 RegUsageInfo: Cleanup; NFC
- Remove unnecessary anchor function
- Remove unnecessary override of getAnalysisUsage
- Use reference instead of pointers where things cannot be nullptr
- Use ArrayRef instead of std::vector where possible

llvm-svn: 337989
2018-07-26 00:27:51 +00:00
Matthias Braun bde0806d5f CodeGen.cpp: Sort initializers; NFC
llvm-svn: 337988
2018-07-26 00:27:49 +00:00
Matthias Braun 57dd5b3dea CodeGen: Cleanup regmask construction; NFC
- Avoid duplication of regmask size calculation.
- Simplify allocateRegisterMask() call.
- Rename allocateRegisterMask() to allocateRegMask() to be consistent
  with naming in MachineOperand.

llvm-svn: 337986
2018-07-26 00:27:47 +00:00
Wolfgang Pieb c42087df7c [DWARF v5] Don't emit multiple DW_AT_rnglists_base attributes. Some refactoring of
range lists emissions and added test cases.

Reviewer: dblaikie

Differential Revision: https://reviews.llvm.org/D49522

llvm-svn: 337981
2018-07-25 23:03:22 +00:00
Eli Friedman d6baff65f7 [GlobalMerge] Handle llvm.compiler.used correctly.
Reuse the handling for llvm.used, and don't transform such globals.

Fixes a failure on the asan buildbot caused by my previous commit.

llvm-svn: 337973
2018-07-25 22:03:35 +00:00
Sanjay Patel 215dcbf4db [SelectionDAG] try to convert funnel shift directly to rotate if legal
If the DAGCombiner's rotate matching was working as expected, 
I don't think we'd see any test diffs here. 

This sidesteps the issue of custom lowering for rotates raised in PR38243:
https://bugs.llvm.org/show_bug.cgi?id=38243
...by only dealing with legal operations.

llvm-svn: 337966
2018-07-25 21:38:30 +00:00
Eli Friedman 0887cf9cab [GlobalMerge] Allow merging globals with arbitrary alignment.
Instead of depending on implicit padding from the structure layout code,
use a packed struct and emit the padding explicitly.

Differential Revision: https://reviews.llvm.org/D49710

llvm-svn: 337961
2018-07-25 20:58:01 +00:00
Martin Storsjo ff33a95ed4 [COFF] Use comdat shared constants for MinGW as well
GNU binutils tools have no problems with this kind of shared constants,
provided that we actually hook it up completely in AsmPrinter and
produce a global symbol.

This effectively reverts SVN r335918 by hooking the rest of it up
properly.

This feature was implemented originally in SVN r213006, with no reason
for why it can't be used for MinGW other than the fact that GCC doesn't
do it while MSVC does.

Differential Revision: https://reviews.llvm.org/D49646

llvm-svn: 337951
2018-07-25 18:35:42 +00:00
Martin Storsjo d2662c32fb [COFF] Hoist constant pool handling from X86AsmPrinter into AsmPrinter
In SVN r334523, the first half of comdat constant pool handling was
hoisted from X86WindowsTargetObjectFile (which despite the name only
was used for msvc targets) into the arch independent
TargetLoweringObjectFileCOFF, but the other half of the handling was
left behind in X86AsmPrinter::GetCPISymbol.

With only half of the handling in place, inconsistent comdat
sections/symbols are created, causing issues with both GNU binutils
(avoided for X86 in SVN r335918) and with the MS linker, which
would complain like this:

fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4

Differential Revision: https://reviews.llvm.org/D49644

llvm-svn: 337950
2018-07-25 18:35:31 +00:00
Ulrich Weigand 5f75371c5d Fix corruption of result number in LegalizeVectorOps.cpp
When VectorLegalizer::LegalizeOp creates a new SDValue after iterating
over its arguments, we need to refer to the same result number of the
new node that the original value used.

Reviewed by: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D49805

llvm-svn: 337939
2018-07-25 17:08:13 +00:00
Pavel Labath da3c4fb5fe Revert "dwarfgen: Add support for generating the debug_str_offsets section, take 2"
This reverts commit r337933. The build error is fixed but the test now
fails on the darwin buildbots. Investigating...

llvm-svn: 337935
2018-07-25 16:34:43 +00:00
Pavel Labath 78ab659bb4 dwarfgen: Add support for generating the debug_str_offsets section, take 2
This recommits r337910 after fixing an "ambiguous call to addAttribute"
error with some compilers (gcc circa 4.9 and MSVC). It seems that these
compilers will consider a "false -> pointer" conversion during overload
resolution. This creates ambiguity because one I added an overload which
takes a MCExpr * as an argument.

I fix this by making the new overload take MCExpr&, which avoids the
conversion. It also documents the fact that we expect a valid MCExpr
object.

Original commit message follows:

The motivation for this is D49493, where we'd like to test details of
debug_str_offsets behavior which is difficult to trigger from a
traditional test.

This adds the plubming necessary for dwarfgen to generate this section.
The more interesting changes are:
- I've moved emitStringOffsetsTableHeader function from DwarfFile to
  DwarfStringPool, so I can generate the section header more easily from
  the unit test.
- added a new addAttribute overload taking an MCExpr*. This is used to
  generate the DW_AT_str_offsets_base, which links a compile unit to the
  offset table.

I've also added a basic test for reading and writing DW_form_strx forms.

Reviewers: dblaikie, JDevlieghere, probinson

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D49670

llvm-svn: 337933
2018-07-25 15:33:32 +00:00
Pavel Labath b4e17c29dd Revert "dwarfgen: Add support for generating the debug_str_offsets section"
This reverts commit r337910 as it's generating "ambiguous call to
addAttribute" errors on some bots.

Will resubmit once I get a chance to look into the problem.

llvm-svn: 337924
2018-07-25 12:52:30 +00:00
Pavel Labath 7a59e3bf37 dwarfgen: Add support for generating the debug_str_offsets section
Summary:
The motivation for this is D49493, where we'd like to test details of
debug_str_offsets behavior which is difficult to trigger from a
traditional test.

This adds the plubming necessary for dwarfgen to generate this section.
The more interesting changes are:
- I've moved emitStringOffsetsTableHeader function from DwarfFile to
  DwarfStringPool, so I can generate the section header more easily from
  the unit test.
- added a new addAttribute overload taking an MCExpr*. This is used to
  generate the DW_AT_str_offsets_base, which links a compile unit to the
  offset table.

I've also added a basic test for reading and writing DW_form_strx forms.

Reviewers: dblaikie, JDevlieghere, probinson

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D49670

llvm-svn: 337910
2018-07-25 11:55:59 +00:00
Thomas Preud'homme 768d6ce4a3 Fix PR34170: Crash on inline asm with 64bit output in 32bit GPR
Add support for inline assembly with output operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR as in the PR).

llvm-svn: 337903
2018-07-25 11:11:12 +00:00
Tom Stellard 179757ef05 [RegisterBankInfo] Ignore InstrMappings that create impossible to repair operands
Summary:
This is a follow-up to r303043.  In computeMapping(), we need to disqualify an
InstrMapping if it would be impossible to repair one of the registers in the
instruction to match the mapping.

This change is needed in order to be able to define an instruction
mapping for G_SELECT for the AMDGPU target and will be tested
by test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir

Reviewers: ab, qcolombet, t.p.northover, dsanders

Reviewed By: qcolombet

Subscribers: tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D49735

llvm-svn: 337882
2018-07-25 03:08:35 +00:00
Jessica Paquette 58e706a66a [MachineOutliner][NFC] Move outlined function remark into its own function
This pulls the OutlinedFunction remark out into its own function to make
the code a bit easier to read.

llvm-svn: 337849
2018-07-24 20:20:45 +00:00
Jessica Paquette 69f517df27 [MachineOutliner][NFC] Move target frame info into OutlinedFunction
Just some gardening here.

Similar to how we moved call information into Candidates, this moves outlined
frame information into OutlinedFunction. This allows us to remove
TargetCostInfo entirely.

Anywhere where we returned a TargetCostInfo struct, we now return an
OutlinedFunction. This establishes OutlinedFunctions as more of a general
repeated sequence, and Candidates as occurrences of those repeated sequences.

llvm-svn: 337848
2018-07-24 20:13:10 +00:00
Peter Collingbourne e06bac4796 Put "built-in" function definitions in global Used list, for LTO. (fix bug 34169)
When building with LTO, builtin functions that are defined but whose calls have not been inserted yet, get internalized. The Global Dead Code Elimination phase in the new LTO implementation then removes these function definitions. Later optimizations add calls to those functions, and the linker then dies complaining that there are no definitions. This CL fixes the new LTO implementation to check if a function is builtin, and if so, to not internalize (and later DCE) the function. As part of this fix I needed to move the RuntimeLibcalls.{def,h} files from the CodeGen subidrectory to the IR subdirectory. I have updated all the files that accessed those two files to access their new location.

Fixes PR34169

Patch by Caroline Tice!

Differential Revision: https://reviews.llvm.org/D49434

llvm-svn: 337847
2018-07-24 19:34:37 +00:00
Jessica Paquette fca55129b1 [MachineOutliner][NFC] Make Candidates own their call information
Before this, TCI contained all the call information for each Candidate.

This moves that information onto the Candidates. As a result, each Candidate
can now supply how it ought to be called. Thus, Candidates will be able to,
say, call the same function in cheaper ways when possible. This also removes
that information from TCI, since it's no longer used there.

A follow-up patch for the AArch64 outliner will demonstrate this.

llvm-svn: 337840
2018-07-24 17:42:11 +00:00
Jessica Paquette 1cc52a0079 [MachineOutliner][NFC] Move missed opt remark into its own function
Having the missed remark code in the middle of `findCandidates` made the
function hard to follow. This yanks that out into a new function,
`emitNotOutliningCheaperRemark`.

llvm-svn: 337839
2018-07-24 17:37:28 +00:00
Jessica Paquette f94d1d29c1 [MachineOutliner][NFC] Sink some candidate logic into OutlinedFunction
Just some simple gardening to improve clarity.

Before, we had something along the lines of

1) Create a std::vector of Candidates
2) Create an OutlinedFunction
3) Create a std::vector of pointers to Candidates
4) Copy those over to the OutlinedFunction and the Candidate list

Now, OutlinedFunctions create the Candidate pointers. They're still copied
over to the main list of Candidates, but it makes it a bit clearer what's
going on.

llvm-svn: 337838
2018-07-24 17:36:13 +00:00
Shiva Chen f5938bfbf9 Revert "[DebugInfo] Generate DWARF debug information for labels."
This reverts commit b454fa1b4079b6c0a5b1565982d16516385838d7.

llvm-svn: 337812
2018-07-24 06:17:45 +00:00
Shiva Chen d6b2cdf9d4 [DebugInfo] Generate DWARF debug information for labels.
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

Differential Revision: https://reviews.llvm.org/D45556

Patch by Hsiangkai Wang.

llvm-svn: 337799
2018-07-24 02:22:55 +00:00
Martin Storsjo db42d51ee3 [MC] Add a separate flag for skipping comdat constant sections for MinGW. NFC.
This actually has nothing to do with the associative comdat sections
that aren't supported by GNU binutils ld.

Clarify the comments from SVN r335918 and use a separate flag for it.

Differential Revision: https://reviews.llvm.org/D49645

llvm-svn: 337757
2018-07-23 22:15:25 +00:00
Vedant Kumar 22bd6f99fa [SelectionDAG] Reduce DanglingDebugInfo memory traffic, NFC
This avoids approx. 2 x 10^5 DenseMap insertions in both non-debug and
debug -O2 builds of the sqlite3 amalgamation.

llvm-svn: 337751
2018-07-23 21:59:04 +00:00
David Greene 4345df3dad Fix RegScavenger::unprocess
RegScavenger::unprocess walks backward, so it should undo the effects
of defs before undoing effects of kills. Previously it did things in
the opposite order, leaving a register apparently unused (dead) in the
case where an instruction both used (killed) and defined a register.

Differential Revision: https://reviews.llvm.org/D42200

llvm-svn: 337735
2018-07-23 20:23:50 +00:00
Nirav Dave eac2ca4e28 [Legalize] Elide MERGE_VALUES created by scalarizeVectorLoad.
scalarizeVectorLoad creates MERGE_VALUES nodes which are immediately
decomposed in expandLoad. Elide the node in these cases.

llvm-svn: 337708
2018-07-23 16:43:42 +00:00
Cameron McInally 2c9bcffc92 [FPEnv] Legalize double width StrictFP vector operations
Differential Revision: https://reviews.llvm.org/D48809

llvm-svn: 337698
2018-07-23 14:40:17 +00:00
Craig Topper b8b21d2fca [SelectionDAGBuilder] Use APInt::isZero instead of comparing APInt::getZExtValue to 0 in a place where we can't be sure contents of the APInt fit in a uint64_t.
This is used on an extract vector element index which is most cases is going to be an i32 or i64 and the element will be a valid element number. But it is possible to construct IR with a larger type and large out of range value.

llvm-svn: 337652
2018-07-22 05:16:50 +00:00
Craig Topper bf61113e4f [SelectionDAGBuilder] Restrict vector reduction check to types with a power of 2 number of elements.
The check for the shuffles usages probably isn't correct for non power of 2 vectors.

llvm-svn: 337651
2018-07-22 05:16:49 +00:00
Pavel Labath 0120691f53 Fix build breakage from r337562
I changed a variable's type from pointer to reference, but forgot to
update the assert-only code.

llvm-svn: 337564
2018-07-20 15:40:24 +00:00
Nirav Dave 25802ac9fd [DAG] Avoid Node Update assertion due to AND simplification
Check for construction-time folding for incomplete AND nodes in
BackwardsPropagateMask.

Fixes PR38185.

Reviewers: RKSimon, samparker

Reviewed By: samparker

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D49444

llvm-svn: 337563
2018-07-20 15:27:24 +00:00
Pavel Labath 7f7e60694e DwarfDebug: Reduce duplication in addAccel*** methods
Summary:
Each of the four methods had a dozen lines and was doing almost exactly
the same thing: get the appropriate accelerator table kind and insert an
entry into it. I move this common logic to a helper function and make
these methods delegate to it.

This came up in the context of D49493, where I've needed to make adding
a string to a string pool slightly more complicated, and it seemed to
make sense to do it in one place instead of five.

To make this work I've needed to unify the interface of the AccelTable
data types, as some used to store DIE& and others DIE*. I chose to unify
to a reference as that's what the caller uses.

This technically isn't NFC, because it changes the StringPool used for
apple tables in the DWO case (now it uses the main file like DWARF v5
instead of the DWO file). However, that shouldn't matter, as DWO is not
a thing on apple targets (clang frontend simply ignores -gsplit-dwarf).

Reviewers: JDevlieghere, aprantl, probinson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49542

llvm-svn: 337562
2018-07-20 15:24:13 +00:00
Nirav Dave 5a4e11ad9c [DAG] Fix Memory ordering check in ReduceLoadOpStore.
When merging through a TokenFactor we need to check that the
load may be ordered such that no other aliasing memory operations may
happen. It is not sufficient to just check that the load is a member
of the chain token factor as it there may be a indirect chain. Require
the load's chain has only one use.

This fixes PR37826.

Reviewers: spatel, davide, efriedma, craig.topper, RKSimon

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D49388

llvm-svn: 337560
2018-07-20 15:20:50 +00:00
Pavel Labath f9adc20aef [DebugInfo] Generate .debug_names section when it makes sense
Summary:
This patch makes us generate the debug_names section in response to some
user-facing commands (previously it was only generated if explicitly
selected via the -accel-tables option).

My goal was to make this work for DWARF>=5 (as it's an official part of
that standard), and also, as an extension, for DWARF<5 if one is
explicitly tuning for lldb as a debugger (because it brings a large
performance improvement there).

This is slightly complicated by the fact that the debug_names tables are
incompatible with the DWARF v4 type units (they assume that the type
units are in the debug_info section), and unfortunately, right now we
generate DWARF v4-style type units even for -gdwarf-5. For this reason,
I disable all accelerator tables if the user requested type unit
generation. I do this even for apple tables, as they have the same
problem (in fact generating type units for apple targets makes us crash
even before we get around to emitting the accelerator tables).

Reviewers: JDevlieghere, aprantl, dblaikie, echristo, probinson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49420

llvm-svn: 337544
2018-07-20 12:59:05 +00:00
Craig Topper d8734450a2 [DAGCombiner] Fold X - (-Y *Z) -> X + (Y * Z)
llvm-svn: 337518
2018-07-20 01:40:03 +00:00
Stephen Canon 8995c5f0f6 Skip out of SimplifyDemandedBits for BITCAST of f16 to i16
Mirrors the existing exit path for f128, avoiding a crash later on.

Differential Revision: https://reviews.llvm.org/D49524

llvm-svn: 337506
2018-07-19 22:46:42 +00:00
Craig Topper c12c5d421f [DAGCombiner] Teach DAGCombiner that A-(-B) is A+B.
We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub.

llvm-svn: 337502
2018-07-19 22:24:43 +00:00
David Blaikie d66140514d [DebugInfo] Dwarfv5: Avoid unnecessary base_address specifiers in rnglists
Since DWARFv5 rnglists are self descriptive and have distinct encodings
for base-relative (offset_pair) and absolute (start_length) entries,
there's no need to use a base address specifier when describing a lone
address range in a section.

Use that, and improve the test coverage a bit here to include cases like
this and others.

llvm-svn: 337411
2018-07-18 18:04:42 +00:00
Nirav Dave a747d3ca60 [ScheduleDAG] Fix unfolding of SUnits to already existent nodes.
Summary:
If unfolding an SUnit results in both load or the operation using it which
already exist in the DAG, abort the unfold if they are already scheduled.
If not, make sure we don't add duplicate dependencies.

This fixes PR37916.

Reviewers: davide, eli.friedman, fhahn, bogner

Subscribers: MatzeB, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48666

llvm-svn: 337409
2018-07-18 18:01:03 +00:00
Wei Mi 10ef7d20b2 [RegAlloc][NFC] Fix the help string of the option "huge-size-for-split".
llvm-svn: 337402
2018-07-18 16:56:33 +00:00
Simon Pilgrim 3cb3056fca Fix -Wdocumentation warning. NFCI.
llvm-svn: 337367
2018-07-18 09:07:54 +00:00
Peter Collingbourne fc50498ced CodeGen: Don't create address significance table entries for thread-local variables.
The presence of these symbols in the symbol table can cause symbol type
mismatch errors (or undefined symbol errors on emulated TLS targets)
and they can't be ICF'd anyway.

llvm-svn: 337338
2018-07-18 00:21:40 +00:00
Peter Collingbourne bd9d313d5c CodeGen: Add a target option for emitting .addrsig directives for all address-significant symbols.
Differential Revision: https://reviews.llvm.org/D48143

llvm-svn: 337331
2018-07-17 22:40:08 +00:00
Tim Renouf e1016f1bc7 More fixes for subreg join failure in RegCoalescer
Summary:
Part of the adjustCopiesBackFrom method wasn't correctly dealing with SubRange
intervals when updating.

2 changes. The first to ensure that bogus SubRange Segments aren't propagated when
encountering Segments of the form [1234r, 1234d:0) when preparing to merge value
numbers. These can be removed in this case.

The second forces a shrinkToUses call if SubRanges end on the copy index
(instead of just the parent register).

V2: Addressed review comments, plus MIR test instead of ll test

Subscribers: MatzeB, qcolombet, nhaehnle

Differential Revision: https://reviews.llvm.org/D40308

Change-Id: I1d2b2b4beea802fce11da01edf71feb2064aab05
llvm-svn: 337273
2018-07-17 12:38:39 +00:00
Simon Pilgrim e4d12bb2d6 [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT
If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops.

Differential Revision: https://reviews.llvm.org/D49262

llvm-svn: 337258
2018-07-17 09:45:35 +00:00
Sanjay Patel c71adc8040 [Intrinsics] define funnel shift IR intrinsics + DAG builder support
As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html
http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html

We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions, 
and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation 
by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift" 
operation which may also be a single hardware instruction.

Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working 
on that (much larger patch), but then I concluded that we don't need it. At least as a first 
step, we have all of the backend support necessary to match these ops...because it was required. 
And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are 
likely all that we'll ever need.

There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes
(along with improving the oversized shift documentation). Again, I don't think that's strictly 
necessary (as the test results here prove). That can be an efficiency improvement as a small 
follow-up patch.

So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support. 

Differential Revision: https://reviews.llvm.org/D49242

llvm-svn: 337221
2018-07-16 22:59:31 +00:00
Fangrui Song cb0bab86b3 [CodeGen] Fix inconsistent declaration parameter name
llvm-svn: 337200
2018-07-16 18:51:40 +00:00
Mandeep Singh Grang 20239b18bb [llvm] Change 2 instances of std::sort to llvm::sort
llvm-svn: 337192
2018-07-16 17:26:37 +00:00
Wei Mi 40c4aa7637 [RegAlloc] Skip global splitting if the live range is huge and its spill is
trivially rematerializable.

We run into a case where machineLICM hoists a large number of live ranges
outside of a big loop because it thinks those live ranges are trivially
rematerializable. In regalloc, global splitting is tried out first for those
live ranges before they are spilled and rematerialized. Because the global
splitting algorithm is quadratic, increasing a lot of global splitting
candidates causes huge compile time increase (50s to 1400s on my local
machine when compiling a module).

However, we think for live ranges which are very large and are trivially
rematerialiable, it is better to just skip global splitting so as to save
compile time with little chance of sacrificing performance.  We uses the
segment size of live range to indirectly evaluate whether the global
splitting of the live range can introduce high cost, and use an option
as a knob to adjust the size limit threshold.

Differential Revision: https://reviews.llvm.org/D49353

llvm-svn: 337186
2018-07-16 15:42:20 +00:00
Roman Lebedev de506632aa [X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' pattern
Summary:

[[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]]

As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.

But the IR-optimal patter does not lower efficiently, so we want to undo it..

This handles the simple pattern.
There is a second pattern with predicate and constants inverted.

NOTE: we do not check uses here. we always do the transform.

Reviewers: spatel, craig.topper, RKSimon, javed.absar

Reviewed By: spatel

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D49266

llvm-svn: 337166
2018-07-16 12:44:10 +00:00
Daniel Cederman c3d8002c2e Avoid losing Hi part when expanding VAARG nodes on big endian machines
Summary:
If the high part of the load is not used the offset to the next element
will not be set correctly.

For example, on Sparc V8, the following code will read val2 from offset 4
instead of 8.

```
int val = __builtin_va_arg(va, long long);
int val2 = __builtin_va_arg(va, int);
```

Reviewers: jyknight

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48595

llvm-svn: 337161
2018-07-16 12:14:17 +00:00
Jonas Devlieghere 98062cb396 [AccelTable] Provide DWARF5AccelTableStaticData for dsymutil.
For dsymutil we want to store offsets in the accelerator table entries
rather than DIE pointers. In addition, we need a way to communicate
which CU a DIE belongs to. This patch provides support for both of these
issues.

Differential revision: https://reviews.llvm.org/D49102

llvm-svn: 337158
2018-07-16 10:52:27 +00:00
Michael J. Spencer 7bb2767fba Recommit r335794 "Add support for generating a call graph profile from Branch Frequency Info." with fix for removed functions.
llvm-svn: 337140
2018-07-16 00:28:24 +00:00
Sanjay Patel 79a423cfa2 [DAGCombiner] fix typo in comment; NFC
llvm-svn: 337132
2018-07-15 17:09:35 +00:00
Sanjay Patel a41c886c55 [DAGCombiner] extend(ifpositive(X)) -> shift-right (not X)
This is almost the same as an existing IR canonicalization in instcombine, 
so I'm assuming this is a good early generic DAG combine too.

The motivation comes from reduced bit-hacking for select-of-constants in IR 
after rL331486. We want to restore that functionality in the DAG as noted in
the commit comments for that change and the llvm-dev discussion here:
http://lists.llvm.org/pipermail/llvm-dev/2018-July/124433.html

The PPC and AArch tests show that those targets are already doing something 
similar. x86 will be neutral in the minimal case and generally better when 
this pattern is extended with other ops as shown in the signbit-shift.ll tests.

Note the asymmetry: we don't include the (extend (ifneg X)) transform because 
it already exists in SimplifySelectCC(), and that is verified in the later 
unchanged tests in the signbit-shift.ll files. Without the 'not' op, the 
general transform to use a shift is always a win because that's a single 
instruction.

Alive proofs:
https://rise4fun.com/Alive/ysli

Name: if pos, get -1
  %c = icmp sgt i16 %x, -1
  %r = sext i1 %c to i16
  =>
  %n = xor i16 %x, -1
  %r = ashr i16 %n, 15

Name: if pos, get 1
  %c = icmp sgt i16 %x, -1
  %r = zext i1 %c to i16
  =>
  %n = xor i16 %x, -1
  %r = lshr i16 %n, 15

Differential Revision: https://reviews.llvm.org/D48970

llvm-svn: 337130
2018-07-15 16:27:07 +00:00
Francis Visoiu Mistrih f905bf1408 [MachineOutliner] Check the last instruction from the sequence when updating liveness
The MachineOutliner was doing an std::for_each from the call (inserted
before the outlined sequence) to the iterator at the end of the
sequence.

std::for_each needs the iterator past the end, so the last instruction
was not taken into account when propagating the liveness information.

This fixes the machine verifier issue in machine-outliner-disubprogram.ll.

Differential Revision: https://reviews.llvm.org/D49295

llvm-svn: 337090
2018-07-14 09:40:01 +00:00
Chandler Carruth 90358e1ef1 [SLH] Introduce a new pass to do Speculative Load Hardening to mitigate
Spectre variant #1 for x86.

There is a lengthy, detailed RFC thread on llvm-dev which discusses the
high level issues. High level discussion is probably best there.

I've split the design document out of this patch and will land it
separately once I update it to reflect the latest edits and updates to
the Google doc used in the RFC thread.

This patch is really just an initial step. It isn't quite ready for
prime time and is only exposed via debugging flags. It has two major
limitations currently:
1) It only supports x86-64, and only certain ABIs. Many assumptions are
   currently hard-coded and need to be factored out of the code here.
2) It doesn't include any options for more fine-grained control, either
   of which control flow edges are significant or which loads are
   important to be hardened.
3) The code is still quite rough and the testing lighter than I'd like.

However, this is enough for people to begin using. I have had numerous
requests from people to be able to experiment with this patch to
understand the trade-offs it presents and how to use it. We would also
like to encourage work to similar effect in other toolchains.

The ARM folks are actively developing a system based on this for
AArch64. We hope to merge this with their efforts when both are far
enough along. But we also don't want to block making this available on
that effort.

Many thanks to the *numerous* people who helped along the way here. For
this patch in particular, both Eric and Craig did a ton of review to
even have confidence in it as an early, rough cut at this functionality.

Differential Revision: https://reviews.llvm.org/D44824

llvm-svn: 336990
2018-07-13 11:13:58 +00:00
Petar Jovanovic be2e80af12 [LiveDebugValues] Tracking copying value between registers
During the execution of long functions or functions that have a lot of
inlined code it could come to the situation where tracked value could be
transferred from one register to another. The transfer is recognized only if
destination register is a callee saved register and if source register is
killed. We do not salvage caller-saved registers since there is a great
chance that killed register would outlive it.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D44016

llvm-svn: 336978
2018-07-13 08:24:26 +00:00
Matthias Braun 90ad6835dd CodeGen: Remove pipeline dependencies on StackProtector; NFC
This re-applies r336929 with a fix to accomodate for the Mips target
scheduling multiple SelectionDAG instances into the pass pipeline.

PrologEpilogInserter and StackColoring depend on the StackProtector analysis
being alive from the point it is run until PEI, which requires that they are all
scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass
between StackProtector and PEI results in these passes being in separate
FunctionPassManagers and the StackProtector is not available for PEI.

PEI and StackColoring don't use much information from the StackProtector pass,
so transfering the required information to MachineFrameInfo is cleaner than
keeping the StackProtector pass around. This commit moves the SSP layout
information to MFI instead of keeping it in the pass.

This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.

Patch by Stephen Crane <sjc@immunant.com>

Differential Revision: https://reviews.llvm.org/D49256

llvm-svn: 336964
2018-07-13 00:08:38 +00:00
Matthias Braun f03f32d478 Revert "(HEAD -> master, origin/master, arcpatch-D37582) CodeGen: Remove pipeline dependencies on StackProtector; NFC"
This was triggering pass scheduling failures.

This reverts commit r336929.

llvm-svn: 336934
2018-07-12 19:27:01 +00:00
Matthias Braun 9436570cbd CodeGen: Remove pipeline dependencies on StackProtector; NFC
PrologEpilogInserter and StackColoring depend on the StackProtector analysis
being alive from the point it is run until PEI, which requires that they are all
scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass
between StackProtector and PEI results in these passes being in separate
FunctionPassManagers and the StackProtector is not available for PEI.

PEI and StackColoring don't use much information from the StackProtector pass,
so transfering the required information to MachineFrameInfo is cleaner than
keeping the StackProtector pass around. This commit moves the SSP layout
information to MFI instead of keeping it in the pass.

This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.

Patch by Stephen Crane <sjc@immunant.com>

Differential Revision: https://reviews.llvm.org/D49256

llvm-svn: 336929
2018-07-12 18:33:32 +00:00
Wolfgang Pieb fcf3810cf7 [DWARF v5] Generate range list tables into the .debug_rnglists section. No support for split DWARF
and no use of DW_FORM_rnglistx with the DW_AT_ranges attribute.

Reviewer: aprantl

Differential Revision: https://reviews.llvm.org/D49214

llvm-svn: 336927
2018-07-12 18:18:21 +00:00
Eli Friedman 0319c28459 [CodeGen] Emit more precise AssertZext/AssertSext nodes.
This is marginally helpful for removing redundant extensions, and the
code is easier to read, so it seems like an all-around win. In the new
test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an
AssertZext i2, which allows the extension of the return value to be
eliminated.

Differential Revision: https://reviews.llvm.org/D49004

llvm-svn: 336868
2018-07-11 23:26:35 +00:00
Krzysztof Parzyszek 0b492f7fb8 [CodeGen] Ignore debug uses in MachineCopyPropagation
Debug uses should not count as real uses, since the presence of debug
information could affect the generated code.

llvm-svn: 336803
2018-07-11 13:30:27 +00:00
Diogo N. Sampaio b0d85ef975 [NFC][InstCombine] Converts isLegalNarrowLoad into isLegalNarrowLdSt
Reuse this function as to test correctness and profitability of
reducing width of either load or store operations.

Reviewsers: samparker

Differential Revision: https://reviews.llvm.org/D48624

llvm-svn: 336800
2018-07-11 12:59:42 +00:00
Simon Pilgrim 075b04a55f [SelectionDAG] Add constant buildvector support to isKnownNeverZero
This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants).

llvm-svn: 336779
2018-07-11 09:56:41 +00:00
Simon Pilgrim df9d59771b [DAGCombiner] Support non-uniform X%C -> X-(X/C)*C folds
First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic.

We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit.

Differential Revision: https://reviews.llvm.org/D48975

llvm-svn: 336774
2018-07-11 09:22:42 +00:00
Simon Pilgrim 97cf111689 [DAGCombiner] Add (urem X, -1) -> select(X == -1, 0, x) fold
llvm-svn: 336773
2018-07-11 09:14:37 +00:00
Simon Pilgrim 4cb4609392 [DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1
udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts.

llvm-svn: 336701
2018-07-10 16:33:07 +00:00
Jonas Devlieghere 3192e35b60 Revert "[AccelTable] Provide abstraction for emitting DWARF5 accelerator tables."
This reverts r336529 because an alternative approach turned out to be a
better fit for dsymuil.

llvm-svn: 336698
2018-07-10 16:18:56 +00:00
Simon Pilgrim 641097d561 [DAGCombiner] visitREM - call visitSDIVLike/visitUDIVLike directly to avoid recursive combining.
As suggested by @efriedma on D48975 use the visitSDIVLike/visitUDIVLike functions introduced at rL336656.

llvm-svn: 336664
2018-07-10 13:18:16 +00:00
Simon Pilgrim ce5c19b623 [DAGCombiner] Split SDIV/UDIV optimization expansions from the rest of the combines. NFCI.
As suggested by @efriedma on D48975, this patch separates the BuildDiv/Pow2 style optimizations from the rest of the visitSDIV/visitUDIV to make it easier to reuse the combines and will allow us to avoid some rather nasty node recursive combining in visitREM.

llvm-svn: 336656
2018-07-10 11:38:00 +00:00
Wolfgang Pieb e194f73e9f [DWARF][NFC] Refactor range list emission to use a static helper
This is prep for DWARF v5 range list emission. Emission of a single range list is moved
to a static helper function.

Reviewer: jdevlieghere

Differential Revision: https://reviews.llvm.org/D49098

llvm-svn: 336621
2018-07-10 00:10:11 +00:00
Mark Searles 7139dea6d9 RenameIndependentSubregs: Fix handling of undef tied operands
Ensure that, if updating a tied operand pair, to only update
that pair.

Differential Revision: https://reviews.llvm.org/D49052

llvm-svn: 336593
2018-07-09 20:07:03 +00:00
Daniel Sanders 9481399c0f [globalisel][irtranslator] Add support for atomicrmw and (strong) cmpxchg
Summary:
This patch adds support for the atomicrmw instructions and the strong
cmpxchg instruction to the IRTranslator.

I've left out weak cmpxchg because LangRef.rst isn't entirely clear on what
difference it makes to the backend. As far as I can tell from the code, it
only matters to AtomicExpandPass which is run at the LLVM-IR level.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, javed.absar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D40092

llvm-svn: 336589
2018-07-09 19:33:40 +00:00
Roman Lebedev 5ccae1750b [X86][TLI] DAGCombine: Unfold variable bit-clearing mask to two shifts.
Summary:
This adds a reverse transform for the instcombine canonicalizations
that were added in D47980, D47981.

As discussed later, that was worse at least for the code size,
and potentially for the performance, too.

https://rise4fun.com/Alive/Zmpl

Reviewers: craig.topper, RKSimon, spatel

Reviewed By: spatel

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D48768

llvm-svn: 336585
2018-07-09 19:06:42 +00:00
Craig Topper e3b0c7e5bd [SelectionDAG] Add VT consistency checks to the creation of ISD::FMA.
This is similar to what is done for binops. I don't know if this would have helped us catch the bug fixed in r336566 earlier or not, but I figured it couldn't hurt.

llvm-svn: 336576
2018-07-09 18:23:55 +00:00
Jonas Devlieghere 5e810a878d [AccelTable] Provide abstraction for emitting DWARF5 accelerator tables.
When emitting the DWARF accelerator tables from dsymutil, we don't have
a DwarfDebug instance and we use a custom class to represent Dwarf
compile units. This patch adds an interface AccelTableWriterInfo to
abstract these from the Dwarf5AccelTableWriter, so we can have a custom
implementation for this in dsymutil.

Differential revision: https://reviews.llvm.org/D49031

llvm-svn: 336529
2018-07-09 09:08:44 +00:00
Jonas Devlieghere e60ca777b6 [AccelTable] Dwarf5AccelTableEmitter -> Writer (NFC)
Renames Dwarf5AccelTableEmitter to Dwarf5AccelTableWriter as suggested
in D49031.

llvm-svn: 336525
2018-07-09 08:47:38 +00:00
Simon Pilgrim 23f9eddabe [SelectionDAG] Split float and integer isKnownNeverZero tests
Splits off isKnownNeverZeroFloat to handle +/- 0 float cases.

This will make it easier to be more aggressive with the integer isKnownNeverZero tests (similar to ValueTracking), use computeKnownBits etc.

Differential Revision: https://reviews.llvm.org/D48969

llvm-svn: 336492
2018-07-07 18:17:14 +00:00
Simon Pilgrim 0a36daa5a2 Use const APInt& to avoid extra copy. NFCI.
As discussed on D48825.

llvm-svn: 336491
2018-07-07 17:33:48 +00:00
Simon Pilgrim c1d1944053 [DAGCombiner] Add EXTRACT_SUBVECTOR to SimplifyDemandedVectorElts
As discussed on PR37989, this patch adds EXTRACT_SUBVECTOR handling to TargetLowering::SimplifyDemandedVectorElts and calls it from DAGCombiner::visitEXTRACT_SUBVECTOR.

Differential Revision: https://reviews.llvm.org/D48825

llvm-svn: 336490
2018-07-07 17:30:06 +00:00
Vedant Kumar b3091da3af Use Type::isIntOrPtrTy where possible, NFC
It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() ||
T.isPointerTy()`.

I used Python's re.sub with this regex to update users:

  r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)'

llvm-svn: 336462
2018-07-06 20:17:42 +00:00
Nico Weber 038dbf3c24 Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.
llvm-svn: 336453
2018-07-06 17:37:24 +00:00
Vedant Kumar 6379a62250 [Local] replaceAllDbgUsesWith: Update debug values before RAUW
The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.

This improves upon the existing insertReplacementDbgValues API by:

- Updating debug intrinsics in-place, while preventing use-before-def of
  the replacement value.
- Falling back to salvageDebugInfo when a replacement can't be made.
- Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into
  common utility code.

Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:

- The no-op conversion. Applies when the values have the same width, or
  have bit-for-bit compatible pointer representations.
- Truncation. Applies when the new value is wider than the old one.
- Zero/sign extension. Applies when the new value is narrower than the
  old one.

Testing:

- check-llvm, check-clang, a stage2 `-g -O3` build of clang,
  regression/unit testing.
- This resolves a number of mis-sized dbg.value diagnostics from
  Debugify.

Differential Revision: https://reviews.llvm.org/D48676

llvm-svn: 336451
2018-07-06 17:32:39 +00:00
Diogo N. Sampaio 17be994942 Added missing semicolon
llvm-svn: 336428
2018-07-06 10:09:04 +00:00
Diogo N. Sampaio 742bf1a255 [SelectionDAG] https://reviews.llvm.org/D48278
D48278

Allow to reduce redundant shift masks.
For example:
x1 = x & 0xAB00
x2 = (x >> 8) & 0xAB

can be reduced to:
x1 = x & 0xAB00
x2 = x1 >> 8
It only allows folding when the masks and shift values are constants.

llvm-svn: 336426
2018-07-06 09:42:25 +00:00
Diogo N. Sampaio 734cfd11fe Testing commit permision
llvm-svn: 336384
2018-07-05 18:49:32 +00:00
Yvan Roux eaececf5e0 [MachineOutliner] Fix typo in getOutliningCandidateInfo function name
getOutlininingCandidateInfo -> getOutliningCandidateInfo

Differential Revision: https://reviews.llvm.org/D48867

llvm-svn: 336285
2018-07-04 15:37:08 +00:00
Max Kazantsev e8e01143ec [ImplicitNullChecks] Check for rewrite of register used in 'test' instruction
The following code pattern:

       mov %rax, %rcx
       test %rax, %rax
       %rax = ....
       je  throw_npe
       mov(%rcx), %r9
       mov(%rax), %r10

gets transformed into the following incorrect code after implicit null check pass:
        mov %rax, %rcx
       %rax = ....
       faulting_load_op("movl (%rax), %r10", throw_npe)
       mov(%rcx), %r9

For implicit null check pass, if the register that is checked for null value (ie, the register used in the 'test' instruction) is written into before the condition jump, we should avoid doing the optimization.

Patch by Surya Kumari Jangala!

Differential Revision: https://reviews.llvm.org/D48627
Reviewed By: skatkov

llvm-svn: 336241
2018-07-04 08:01:26 +00:00
Simon Pilgrim 74cc4cfa94 [DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegen
Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code

llvm-svn: 336198
2018-07-03 14:11:32 +00:00
David Stenberg 23bba56fce [CodeGen] Make block removal order deterministic in CodeGenPrepare
Summary:
Replace use of a SmallPtrSet with a SmallSetVector to make the worklist
iteration order deterministic. This is done as the order the blocks are
removed may affect whether or not PHI nodes in successor blocks are
removed.

For example, consider the following case where %bb1 and %bb2 are
removed:

    bb1:
      br i1 undef, label %bb3, label %bb4
    bb2:
      br i1 undef, label %bb4, label %bb3
    bb3:
      pv1 = phi type [ undef, %bb1 ], [ undef, %bb2], [ v0, %other ]
      br label %bb4
    bb4:
      pv2 = phi type [ undef, %bb1 ], [ undef, %bb2 ],
                     [ pv1, %bb3 ], [ v0, %other ]

If %bb2 is removed before %bb1, the incoming values from %bb1 and %bb2
to pv1 will be removed before %bb1 is removed as a predecessor to %bb4.
The pv1 node will thus be optimized out (to v0) at the time %bb1 is
removed as a predecessor to %bb4, leaving the blocks as following when
the incoming value from %bb1 has been removed:

    bb3: ; pv1 optimized out, incoming value to pv2 is v0
      br label %bb4
    bb4:
      pv2 = phi type [ v0, %bb3 ], [ v0, %other ]

The pv2 PHI node will be optimized away by removePredecessor() as all
incoming values are identical.

In case %bb2 is removed after %bb1, pv1 will not be optimized out at the
time %bb2 is removed as a predecessor to %bb4, leaving the blocks as
following when the incoming value from %bb2 to pv2 has been removed:

    bb3:
      pv1 = phi type [ undef, %bb2 ], [ v0, %other ]
      br label %bb4
    bb4:
      pv2 = phi type [ pv1, %bb3 ], [ v0, %other ]

The pv2 PHI node will thus not be removed in this case, ultimately
leading to the following output

    bb3: ; pv1 optimized out, incoming value to pv2 is v0
      br label %bb4
    bb4:
      pv2 = phi type [ v0, %bb3 ], [ v0, %other ]

I have not looked into changing DeleteDeadBlock() so that the redundant
PHI nodes are removed.

I have not added a test case, as I was not able to create a particularly
small and (not messy) reproducer. This is likely due to SmallPtrSet
behaving deterministically when in small mode.

Reviewers: void, dexonsmith, spatel, skatkov, fhahn, bkramer, nhaehnle

Reviewed By: fhahn

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D48369

llvm-svn: 336109
2018-07-02 14:23:48 +00:00
Piotr Padlewski 5b3db45e8f Implement strip.invariant.group
Summary:
This patch introduce new intrinsic -
strip.invariant.group that was described in the
RFC: Devirtualization v2

Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar

Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47103

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336073
2018-07-02 04:49:30 +00:00
Simon Pilgrim fae337704e [DAGCombiner] Handle correctly non-splat power of 2 -1 divisor (PR37119)
The combine added in commit 329525 overlooked the case where one, but not all, of the divisor elements is -1, -1 is the only power of two value for which the sdiv expansion recipe breaks.

Thanks to @zvi for the original patch.

Differential Revision: https://reviews.llvm.org/D45806

llvm-svn: 336048
2018-06-30 12:22:55 +00:00
Jessica Paquette 8bda1881ca [MachineOutliner] Add support for target-default outlining.
This adds functionality to the outliner that allows targets to
specify certain functions that should be outlined from by default.

If a target supports default outlining, then it specifies that in
its TargetOptions. In the case that it does, and the user hasn't
specified that they *never* want to outline, the outliner will
be added to the pass pipeline and will run on those default functions.

This is a preliminary patch for turning the outliner on by default
under -Oz for AArch64.

https://reviews.llvm.org/D48776

llvm-svn: 336040
2018-06-30 03:56:03 +00:00
Jessica Paquette 79917b9686 [MachineOutliner] Add always and never options to -enable-machine-outliner
This is a recommit of r335887, which was erroneously committed earlier.

To enable the MachineOutliner by default on AArch64, we need to be able to
disable the MachineOutliner and also provide an option to "always" enable the
outliner.

This adds that capability. It allows the user to still use the old
-enable-machine-outliner option, which defaults to "always". This is building
up to allowing the user to specify "always" versus the target default
outlining behaviour.

https://reviews.llvm.org/D48682

llvm-svn: 335986
2018-06-29 16:12:45 +00:00
Alexey Bataev 2a03d4296a [DEBUG_INFO, NVPTX] Do not emit .debug_loc section.
Summary:
.debug_loc section is not supported for NVPTX target. If there is an
object whose location can change during its lifetime, we do not generate
debug location info for this variable.

Reviewers: echristo

Subscribers: jholewinski, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48730

llvm-svn: 335976
2018-06-29 14:23:28 +00:00
Jessica Paquette 0c5d3ffbb8 [MachineOutliner] Never add the outliner in -O0
This is a recommit of r335879.

We shouldn't add the outliner when compiling at -O0 even if
-enable-machine-outliner is passed in. This makes sure that we
don't add it in this case.

This also removes -O0 from the outliner DWARF test.

llvm-svn: 335930
2018-06-28 21:49:24 +00:00
Martin Storsjo 2a9bd7b756 [COFF] Fix constant sharing regression for MinGW
This fixes a regression since SVN r334523, where the object files
built targeting MinGW were rejected by GNU binutils tools. Prior to
that commit, we only put constants in comdat for MSVC configurations.

Differential Revision: https://reviews.llvm.org/D48567

llvm-svn: 335918
2018-06-28 20:28:29 +00:00
Jessica Paquette dafa198c96 [MachineOutliner] Define MachineOutliner support in TargetOptions
Targets should be able to define whether or not they support the outliner
without the outliner being added to the pass pipeline. Before this, the
outliner pass would be added, and ask the target whether or not it supports the
outliner.

After this, it's possible to query the target in TargetPassConfig, before the
outliner pass is created. This ensures that passing -enable-machine-outliner
will not modify the pass pipeline of any target that does not support it.

https://reviews.llvm.org/D48683

llvm-svn: 335887
2018-06-28 17:45:43 +00:00
Simon Pilgrim 9c70d48cb2 [DAGCombiner] Ensure we use the correct CC result type in visitSDIV (REAPPLIED)
We could get away with it for constant folded cases, but not for rL335719.

Thanks to Krzysztof Parzyszek for noticing.

Reapply original commit rL335821 which was reverted at rL335871 due to a WebAssembly bug that was fixed at rL335884.

llvm-svn: 335886
2018-06-28 17:33:41 +00:00
Jessica Paquette d6261bef7b Revert "[MachineOutliner] Add always and never options to -enable-machine-outliner"
I accidentally committed this instead of D48683 because I haven't had coffee
yet.

llvm-svn: 335883
2018-06-28 17:26:19 +00:00
Jessica Paquette f3a44fe833 Revert "[MachineOutliner] Never add the outliner in -O0"
This reverts commit 9c7c10e4073a0bc6a759ce5cd33afbac74930091.

It relies on r335872 since that introduces the machine outliner
flags test. I meant to commit D48683 in that commit, but got mixed
up and committed D48682 instead. So, I'm reverting this and
r335872, since D48682 hasn't made it through review yet.

llvm-svn: 335882
2018-06-28 17:26:18 +00:00
Jessica Paquette c9d675266e [MachineOutliner] Never add the outliner in -O0
We shouldn't add the outliner when compiling at -O0 even if
-enable-machine-outliner is passed in. This makes sure that we
don't add it in this case.

This also updates machine-outliner-flags to reflect the change
and improves the comment describing what that test does.

llvm-svn: 335879
2018-06-28 17:05:57 +00:00
Matthias Braun da5e7e11d1 SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O)
Add NoTrapAfterNoreturn target option which skips emission of traps
behind noreturn calls even if TrapUnreachable is enabled.

Enable the feature on Mach-O to save code size; Comments suggest it is
not possible to enable it for the other users of TrapUnreachable.

rdar://41530228

DifferentialRevision: https://reviews.llvm.org/D48674
llvm-svn: 335877
2018-06-28 17:00:45 +00:00
Jessica Paquette 1ccb66c5fb [MachineOutliner] Add always and never options to -enable-machine-outliner
To enable the MachineOutliner by default on AArch64, we need to be able to
disable the MachineOutliner and also provide an option to "always" enable the
outliner.

This adds that capability. It allows the user to still use the old
-enable-machine-outliner option, which defaults to "always". This is building
up to allowing the user to specify "always" versus the target-default
outlining behaviour.

llvm-svn: 335872
2018-06-28 16:39:42 +00:00
Haojian Wu 2103990e63 Revert "[DAGCombiner] Ensure we use the correct CC result type in visitSDIV"
This reverts commit r335821.

This crashes the webassembly test, run "ninja check-llvm-codegen-webassembly" to reproduce.

llvm-svn: 335871
2018-06-28 16:25:57 +00:00
Benjamin Kramer 269eb21e1c Revert "Add support for generating a call graph profile from Branch Frequency Info."
This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost.

llvm-svn: 335851
2018-06-28 13:15:03 +00:00
Simon Pilgrim abebe4c746 [DAGCombiner] Ensure we use the correct CC result type in visitSDIV
We could get away with it for constant folded cases, but not for rL335719.

Thanks to Krzysztof Parzyszek for noticing.

llvm-svn: 335821
2018-06-28 09:54:28 +00:00
Simon Pilgrim 49cb65bb7b [DAGCombiner] Remove unused variable. NFCI.
Noticed in D45806 review.

llvm-svn: 335817
2018-06-28 09:29:08 +00:00
Petar Jovanovic d175aeb881 [DwarfDebug] Remove unused argument (NFC)
Remove unused ByteStreamer argument from function emitDebugLocValue.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D48590

llvm-svn: 335811
2018-06-28 04:50:40 +00:00
Michael J. Spencer 5bf1ead377 Add support for generating a call graph profile from Branch Frequency Info.
=== Generating the CG Profile ===

The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions.  For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.

After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}

!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```

Differential Revision: https://reviews.llvm.org/D48105

llvm-svn: 335794
2018-06-27 23:58:08 +00:00
Nirav Dave 7c57ae57a8 [DAGCombine] Disable TokenFactor simplifications when optnone.
llvm-svn: 335773
2018-06-27 19:41:25 +00:00
Daniel Sanders bdeb880d14 [globalisel][legalizer] Add AtomicOrdering to LegalityQuery and use it in AArch64
Now that we have the ability to legalize based on MMO's. Add support for
legalizing based on AtomicOrdering and use it to correct the legalization
of the atomic instructions.

Also extend all() to be a variadic template as this ruleset now requires
3 and 4 argument versions.

llvm-svn: 335767
2018-06-27 19:03:21 +00:00
Sanjay Patel d052de856d [DAGCombiner] restrict (float)((int) f) --> ftrunc with no-signed-zeros
As noted in the D44909 review, the transform from (fptosi+sitofp) to ftrunc 
can produce -0.0 where the original code does not:

#include <stdio.h>
  
int main(int argc) {
  float x;
  x = -0.8 * argc;
  printf("%f\n", (float)((int)x));
  return 0;
}

$ clang -O0 -mavx fp.c ; ./a.out 
0.000000
$ clang -O1 -mavx fp.c ; ./a.out 
-0.000000

Ideally, we'd use IR/node flags to predicate the transform, but the IR parser 
doesn't currently allow fast-math-flags on the cast instructions. So for now, 
just use the function attribute that corresponds to clang's "-fno-signed-zeros" 
option.

Differential Revision: https://reviews.llvm.org/D48085

llvm-svn: 335761
2018-06-27 18:16:40 +00:00
Jessica Paquette f472f6159a [MachineOutliner] Don't outline sequences where x16/x17/nzcv are live across
It isn't safe to outline sequences of instructions where x16/x17/nzcv live
across the sequence.

This teaches the outliner to check whether or not a specific canidate has
x16/x17/nzcv live across it and discard the candidate in the case that that is
true.

https://bugs.llvm.org/show_bug.cgi?id=37573
https://reviews.llvm.org/D47655

llvm-svn: 335758
2018-06-27 17:43:27 +00:00
Simon Pilgrim d3e583a52d [DAGCombiner] visitSDIV - add special case handling for (sdiv X, 1) -> X in pow2 expansion
For divisor = 1, perform a select of X - reduces scalarisation of simple SDIVs

llvm-svn: 335727
2018-06-27 12:45:31 +00:00
Simon Pilgrim e835f662fa [DAGCombiner] visitSDIV - simplify pow2 handling. NFCI.
Use the builtin constant folding of getNode() etc. instead of doing it manually.

llvm-svn: 335720
2018-06-27 10:51:55 +00:00
Simon Pilgrim dfbcc66adc [DAGCombiner] Fold SDIV(%X, MIN_SIGNED) -> SELECT(%X == MIN_SIGNED, 1, 0)
Fixes PR37569.

llvm-svn: 335719
2018-06-27 10:21:06 +00:00
Simon Pilgrim 0a566bc0ae [DAGCombiner] Don't accept signbit sdiv divisors in sdiv-by-pow2 vector expansion (PR37569)
llvm-svn: 335717
2018-06-27 09:41:22 +00:00
Sanjay Patel fb9c440ba5 [DAGCombiner] use isBitwiseNot to simplify code; NFC
llvm-svn: 335652
2018-06-26 19:46:56 +00:00
Simon Pilgrim 7f55af37f4 [DAGCombiner] Don't accept -1 sdiv divisors in sdiv-by-pow2 vector expansion (PR37119)
Temporary fix until I've managed to get D45806 updated - both +1 and -1 special cases need to be properly supported.

llvm-svn: 335637
2018-06-26 17:46:51 +00:00
Simon Pilgrim 133b1cdf08 [DAGCombiner] Pull out VT bitwidth in visitSDIV. NFCI.
llvm-svn: 335617
2018-06-26 15:39:16 +00:00
Krzysztof Parzyszek 9f199ebec0 Silence "unused variable" warning in LiveIntervals.cpp after r335607
llvm-svn: 335610
2018-06-26 14:55:04 +00:00
Krzysztof Parzyszek 70f027022c Account for undef values from predecessors in extendSegmentsToUses
It is legal for a PHI node not to have a live value in a predecessor
as long as the end of the predecessor is jointly dominated by an undef
value.

llvm-svn: 335607
2018-06-26 14:37:16 +00:00
Vedant Kumar b725c69f12 [SelectionDAG] Remove debug locations from ConstantSD(FP)Nodes
This removes debug locations from ConstantSDNode and ConstantSDFPNode.

When this kind of node is materialized we no longer create a line table
entry which jumps back to the constant's first point of use. This makes
single-stepping behavior smoother, and it matches the model used by IR,
where Constants have no locations. See this thread for more context:

  http://lists.llvm.org/pipermail/llvm-dev/2018-June/124164.html

I'd like to handle constant BuildVectorSDNodes and to try to eliminate
passing SDLocs to SelectionDAG::getConstant*() in follow-up commits.

Differential Revision: https://reviews.llvm.org/D48468

llvm-svn: 335497
2018-06-25 17:06:18 +00:00
Matt Arsenault 921f7a27cc StackSlotColoring: Decide colors per stack ID
I thought I fixed this in r308673, but that fix was
very broken. The assumption that any frame index can be used
in place of another was more widespread than I realized.
Even when stack slot sharing was disabled, this was still
replacing frame index uses with a different ID with a different
stack slot.

Really fix this by doing the coloring per-stack ID, so all of
the coloring logically done in a separate namespace. This is a lot
simpler than trying to figure out how to change the color if
the stack ID is different.

llvm-svn: 335488
2018-06-25 16:05:55 +00:00
Krzysztof Parzyszek 4581f37e7c Improve handling of COPY instructions with identical value numbers
Testcases provided by Tim Renouf.

Differential Revision: https://reviews.llvm.org/D48102

llvm-svn: 335472
2018-06-25 13:46:41 +00:00
Artur Pilipenko ddc7f391d2 Revert change 335077 "[InlineSpiller] Fix a crash due to lack of forward progress from remat specifically for STATEPOINT"
This change caused widespread assertion failures in our downstream testing:
lib/CodeGen/LiveInterval.cpp:409: bool llvm::LiveRange::overlapsFrom(const llvm::LiveRange&, llvm::LiveRange::const_iterator) const: Assertion `!empty() && "empty range"' failed.

llvm-svn: 335462
2018-06-25 12:58:13 +00:00
Simon Pilgrim 5b6b500687 Fix -Wparentheses gcc warning. NFCI.
llvm-svn: 335451
2018-06-25 11:19:05 +00:00
Sanjay Patel 962ee178fa [DAGCombiner] eliminate setcc bool math when input is low-bit of some value
This patch has the same motivating example as D48466:
define void @foo(i64 %x, i32 %c.0282.in, i32 %d.0280, i32* %ptr0, i32* %ptr1) {
    %c.0282 = and i32 %c.0282.in, 268435455
    %a16 = lshr i64 32508, %x
    %a17 = and i64 %a16, 1
    %tobool = icmp eq i64 %a17, 0
    %. = select i1 %tobool, i32 1, i32 2
    %.286 = select i1 %tobool, i32 27, i32 26
    %shr97 = lshr i32 %c.0282, %.
    %shl98 = shl i32 %c.0282.in, %.286
    %or99 = or i32 %shr97, %shl98
    %shr100 = lshr i32 %d.0280, %.
    %shl101 = shl i32 %d.0280, %.286
    %or102 = or i32 %shr100, %shl101
    store i32 %or99, i32* %ptr0
    store i32 %or102, i32* %ptr1
    ret void
}

...but I'm trying to kill the setcc bool math sooner rather than later.

By matching a larger pattern that includes both the low-bit mask and the trailing add/sub, 
we can create a universally good fold because we always eliminate the condition code 
intermediate value.

Here are Alive proofs for these (currently instcombine folds the 'add' variants, but 
misses the 'sub' patterns):
https://rise4fun.com/Alive/Gsyp

Name: sub of zext cmp mask
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %z = zext i1 %c to i32
  %r = sub i32 C1, %z
  =>
  %optional_cast = zext i8 %a to i32
  %r = add i32 %optional_cast, C1-1

Name: add of zext cmp mask
  %a = and i32 %x, 1
  %c = icmp eq i32 %a, 0
  %z = zext i1 %c to i8
  %r = add i8 %z, C1
  =>
  %optional_cast = trunc i32 %a to i8
  %r = sub i8 C1+1, %optional_cast

All of the tests look like improvements or neutral to me. But it is possible that x86 
test+set+bitop is better than what we now show here. I suspect we could do better by 
adding another fold for the 'sub' variants.

We start with select-of-constant in IR in the larger motivating test, so that's why I 
included tests with selects. Proofs for those variants:
https://rise4fun.com/Alive/Bx1

Name: true const is bigger
Pre: C2 == (C1 + 1)
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %r = select i1 %c, i64 C2, i64 C1
  =>
  %z = zext i8 %a to i64
  %r = sub i64 C2, %z

Name: false const is bigger
Pre: C2 == (C1 + 1)
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %r = select i1 %c, i64 C1, i64 C2
  =>
  %z = zext i8 %a to i64
  %r = add i64 C1, %z

Differential Revision: https://reviews.llvm.org/D48466

llvm-svn: 335433
2018-06-24 14:37:30 +00:00
Krzysztof Parzyszek 358a916aa8 Initialize LiveRegs once in BranchFolder::mergeCommonTails
llvm-svn: 335365
2018-06-22 16:38:38 +00:00
George Rimar dcf59c5480 Recommit r335333 "[MC] - Add .stack_size sections into groups and link them with .text"
With compilation fix.

Original commit message:

D39788 added a '.stack-size' section containing metadata on function stack sizes
to output ELF files behind the new -stack-size-section flag.

This change does following two things on top:

1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. 
    The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to
    eliminate them fast during resolving the COMDATs.
2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text.
   With that linker will be able to do -gc-sections on dead stack sizes sections.

Differential revision: https://reviews.llvm.org/D46874

llvm-svn: 335336
2018-06-22 10:53:47 +00:00
George Rimar 6d448da1be Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text"
It broke bots.

http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443
http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551

llvm-svn: 335333
2018-06-22 10:27:33 +00:00
George Rimar e14485a0c6 [MC] - Add .stack_size sections into groups and link them with .text
D39788 added a '.stack-size' section containing metadata on function stack sizes
to output ELF files behind the new -stack-size-section flag.

This change does following two things on top:

1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. 
    The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to
    eliminate them fast during resolving the COMDATs.
2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text.
   With that linker will be able to do -gc-sections on dead stack sizes sections.

Differential revision: https://reviews.llvm.org/D46874

llvm-svn: 335332
2018-06-22 10:10:53 +00:00
Chandler Carruth aa5f4d2e23 Revert r335306 (and r335314) - the Call Graph Profile pass.
This is the first pass in the main pipeline to use the legacy PM's
ability to run function analyses "on demand". Unfortunately, it turns
out there are bugs in that somewhat-hacky approach. At the very least,
it leaks memory and doesn't support -debug-pass=Structure. Unclear if
there are larger issues or not, but this should get the sanitizer bots
back to green by fixing the memory leaks.

llvm-svn: 335320
2018-06-22 05:33:57 +00:00
Michael J. Spencer fc93dd8e18 [Instrumentation] Add Call Graph Profile pass
This patch adds support for generating a call graph profile from Branch Frequency Info.

The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.

After scanning all the functions, it generates an appending module flag containing the data. The format looks like:

!llvm.module.flags = !{!0}

!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}

Differential Revision: https://reviews.llvm.org/D48105

llvm-svn: 335306
2018-06-21 23:31:10 +00:00
Reid Kleckner 2ef486690c [X86] Fix 32-bit mingw comdat names, only add one underscore
llvm-svn: 335304
2018-06-21 23:06:33 +00:00
Reid Kleckner 13c9ee684c [mingw] Fix GCC ABI compatibility for comdat things
Summary:
GCC and the binutils COFF linker do comdats differently from MSVC.
If we want to be ABI compatible, we have to do what they do, which is to
emit unique section names like ".text$_Z3foov" instead of short section
names like ".text". Otherwise, the binutils linker gets confused and
reports multiple definition errors when two object files from GCC and
Clang containing the same inline function are linked together.

The best description of the issue is probably at
https://github.com/Alexpux/MINGW-packages/issues/1677, we don't seem to
have a good one in our tracker.

I fixed up the .pdata and .xdata sections needed everywhere other than
32-bit x86. GCC doesn't use associative comdats for those, it appears to
rely on the section name.

Reviewers: smeenai, compnerd, mstorsjo, martell, mati865

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D48402

llvm-svn: 335286
2018-06-21 20:27:38 +00:00
Matt Davis d041f21810 [DebugInfo] Ignore DBG_VALUE instructions in PostRA Machine Sink
Summary:
The logic for handling the sinking of COPY instructions was generating
different code when building with debug flags.

The original code did not take into consideration debug instructions.  This
resulted in the registers in the DBG_VALUE instructions being treated as used,
and prevented the COPY from being sunk.  This patch avoids analyzing debug
instructions when trying to sink COPY instructions.

This patch also creates a routine from the code in MachineSinking::SinkInstruction to
perform the logic of sinking an instruction along with its debug instructions.
This functionality is used in multiple places, including the code for sinking COPY instrs.


Reviewers: junbuml, javed.absar, MatzeB, bjope

Reviewed By: bjope

Subscribers: aprantl, probinson, thegameg, jonpa, bjope, vsk, kristof.beyls, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D45637

llvm-svn: 335264
2018-06-21 17:59:52 +00:00
Stanislav Mekhanoshin 22ee191c3e DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1"
Allowed folding for "and/or" binops with non-constant operand if
arguments of select are 0/-1 values.

Normally this code with "and" opcode does not get to a DAG combiner
and simplified yet in the InstCombine. However AMDGPU produces it
during lowering and InstCombine has no chance to optimize it out.

In turn the same pattern with "or" opcode can reach DAG.

Differential Revision: https://reviews.llvm.org/D48301

llvm-svn: 335250
2018-06-21 16:02:05 +00:00
Krzysztof Parzyszek 6a3229301f [CodeGen] Avoid handling DBG_VALUE in LiveRegUnits::stepBackward
Patch by Jesper Antonsson.

Differential Revision: https://reviews.llvm.org/D48420

llvm-svn: 335233
2018-06-21 13:38:43 +00:00
Mikael Holmen 42f7bc96dd [DebugInfo] Make sure all DBG_VALUEs' reguse operands have IsDebug property
Summary:
In some cases, these operands lacked the IsDebug property, which is meant to signal that
they should not affect codegen. This patch adds a check for this property in the
MachineVerifier and adds it where it was missing.

This includes refactorings to use MachineInstrBuilder construction functions instead of
manually setting up the intrinsic everywhere.

Patch by: JesperAntonsson

Reviewers: aprantl, rnk, echristo, javed.absar

Reviewed By: aprantl

Subscribers: qcolombet, sdardis, nemanjai, JDevlieghere, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D48319

llvm-svn: 335214
2018-06-21 10:03:34 +00:00
David Green a465188500 [DAGCombine] Fix alignment for offset loads/stores
The alignment parameter to getExtLoad is treated as a base alignment,
not the alignment of the load (base + offset). When we infer a better
alignment for a Ptr we need to ensure that it applies to the base to
prevent the alignment on the load from being wrong.

This fixes a bug where the alignment could then be used to incorrectly
prove noalias between a load and a store, leading to a miscompile.

Differential Revision: https://reviews.llvm.org/D48029

llvm-svn: 335210
2018-06-21 08:30:07 +00:00
Eric Christopher 76cfef46f3 Add some explanatory text to the associated symbol support.
llvm-svn: 335207
2018-06-21 07:15:14 +00:00
Mikael Holmen 57b33f6aac [DebugInfo] Keep DBG_VALUE undef in LiveDebugVariables
Summary:
Fixes PR36579.

For cases where we had e.g.

 DBG_VALUE 42
 [...]
 DBG_VALUE undef

LiveDebugVariables would discard all undef DBG_VALUEs and then it would
look like the variable had the value 42 throughout the rest of the
function, which is incorrect.

With this patch we don't remove all undef DBG_VALUEs in LiveDebugVariables
so they will be kept after register allocation just like other DBG_VALUEs
which will yield more correct debug information.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: bjope, Ka-Ka, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48277

llvm-svn: 335205
2018-06-21 07:02:46 +00:00
Alina Sbirlea dfd14adeb0 Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
Summary:
Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor

Prior to the patch:
1. MergeBasicBlockIntoOnlyPred
Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)

2. MergeBlockIntoPredecessor
Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:

1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
Updated in this patch. No challenges.

2. lib/CodeGen/CodeGenPrepare.cpp
Updated in this patch.
  i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
  ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
Some interesting aspects:
  - Since Pred is not deleted (BB is), the entry block does not need updating.
  - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
  - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
  - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
  - adding some test owner as subscribers for the interesting tests modified:
    test/CodeGen/X86/avx-cmp.ll
    test/CodeGen/AMDGPU/nested-loop-conditions.ll
    test/CodeGen/AMDGPU/si-annotate-cf.ll
    test/CodeGen/X86/hoist-spill.ll
    test/CodeGen/X86/2006-11-17-IllegalMove.ll

3. lib/Transforms/Scalar/JumpThreading.cpp
Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.

Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar

Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D48202

llvm-svn: 335183
2018-06-20 22:01:04 +00:00
Stanislav Mekhanoshin 20279dc025 Allow binop C1, (select cc, CF, CT) -> select folding
Previously this folding was done only if select is a first operand.
However, for non-commutative operations constant may go before
select.

Differential Revision: https://reviews.llvm.org/D48223

llvm-svn: 335167
2018-06-20 20:24:20 +00:00
Bjorn Pettersson 7bf676662a [DAG] Don't map a TableId to itself in the ReplacedValues map
Summary:
Found some regressions (infinite loop in DAGTypeLegalizer::RemapId)
after r334880. This patch makes sure that we do map a TableId to
itself.

Reviewers: niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48364

llvm-svn: 335141
2018-06-20 16:06:09 +00:00
Nirav Dave cd558887d3 [DAG] Fix and-mask folding when narrowing loads.
Summary:
Check that and masks are strictly smaller than implicit mask from
narrowed load.

Fixes PR37820.

Reviewers: samparker, RKSimon, nemanjai

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48335

llvm-svn: 335137
2018-06-20 15:36:29 +00:00
Hiroshi Inoue c73b6d6bf7 [NFC] fix trivial typos in comments
llvm-svn: 335096
2018-06-20 05:29:26 +00:00
Craig Topper ddd88a559f [DAGCombiner] Add some comments to some true/false arguments to make it obvious what they are. NFC
llvm-svn: 335095
2018-06-20 04:32:07 +00:00
Craig Topper 59da74370b [SelectionDAG] Don't crash on inline assembly errors when the inline assembly return type is a struct.
Summary:
If we get an error building the SelectionDAG for inline assembly we try to continue and still build the DAG.

But if the return type for the inline assembly is a struct we end up crashing because we try to create an UNDEF node with a struct type which isn't valid.

Instead we need to create an UNDEF for each element of the struct and join them with merge_values.

This patch relies on single operand merge_values being handled gracefully by getMergeValues. If the return type is void there will be no VTs returned by ComputeValueVTs and now we just return instead of calling setValue. Hopefully that's ok, I assumed nothing would need to look up the mapped value for void node.

Fixes PR37359

Reviewers: rengolin, rovka, echristo, efriedma, bogner

Reviewed By: efriedma

Subscribers: craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D46560

llvm-svn: 335093
2018-06-20 04:32:05 +00:00
Philip Reames 8befa295ea [InlineSpiller] Fix a crash due to lack of forward progress from remat specifically for STATEPOINT
This patch covers up a fairly fundemental issue around remat and register allocation which shows up with psuedo instructions with more vreg uses than there are physical registers.  This patch essentially just disables remat for STATEPOINTs which are the only case we've seen so far, but long term we need a better fix.

For STATEPOINTs specifically, this is a strict improvement.  It unblocks progress towards enabling a currently off-by-default mode which integrates deopt bundle operand lowering with register allocator spilling so that we end up with smaller stack sizes and more optimally placed spills.  Assming no other issues turn up during my next round of integration testing - which based on experience so far, is admittedly unlikely - we might finally be able to enable something I've been working towards in small bits and pieces for years now.  :)

For psuedo ops in general, there are a couple of ideas for a "proper fix" discussed on the bug, but I'm far enough outside my knowledge area to not be able to see any of them through to a successful conclusion.  If anyone wants to help out here, please do.  

Differential Revision: https://reviews.llvm.org/D41098

llvm-svn: 335077
2018-06-19 21:19:59 +00:00
Jessica Paquette 32de26d432 [MachineOutliner] NFC: Remove insertOutlinerPrologue, rename insertOutlinerEpilogue
insertOutlinerPrologue was not used by any target, and prologue-esque code was
beginning to appear in insertOutlinerEpilogue. Refactor that into one function,
buildOutlinedFrame.

This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue.

llvm-svn: 335076
2018-06-19 21:14:48 +00:00
Matt Davis a245c765a8 [MIRParser] Update a diagnostic message to use the correct register sigil. NFC
Summary:
Patch r323922 changed the sigil for physical registers to '$',  instead of '%'.
An error message was missed during this change, and reports the wrong sigil.
This patch corrects that diagnostic and the tests that check that error string.


Reviewers: zer0, bjope

Reviewed By: bjope

Subscribers: bjope, thegameg, plotfi, llvm-commits

Differential Revision: https://reviews.llvm.org/D48086

llvm-svn: 335066
2018-06-19 18:39:40 +00:00
Heejin Ahn 33c3fce592 [WebAssembly] Add WasmEHFuncInfo for unwind destination information
Summary:
Add WasmEHFuncInfo and routines to calculate and fill in this struct to
keep track of unwind destination information. This will be used in
other EH related passes.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D48263

llvm-svn: 335005
2018-06-19 00:26:39 +00:00
Michael Berg 7b993d762f Utilize new SDNode flag functionality to expand current support for fadd
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.

Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar

Reviewed By: spatel

Subscribers: wdng, nhaehnle

Differential Revision: https://reviews.llvm.org/D47909

llvm-svn: 334996
2018-06-18 23:44:59 +00:00
Michael Berg 932ba20af8 refactor of visitFADD for AllowNewConst cases
Summary: Refactoring for all constant cases which require AllowNewConst and some staging for future fmf usage.

Reviewers: spatel, hfinkel, wristow

Reviewed By: spatel

Subscribers: nhaehnle

Differential Revision: https://reviews.llvm.org/D48289

llvm-svn: 334984
2018-06-18 21:12:21 +00:00
Michael Berg cafe947445 [NFC] make MIFlag accessor functions consistant with usage model
llvm-svn: 334970
2018-06-18 18:37:48 +00:00
Krzysztof Parzyszek 546017322f Shrink interval after moving copy in removePartialRedundancy
llvm-svn: 334963
2018-06-18 17:16:39 +00:00
Nirav Dave d4ff2f8a74 Avoid needing to walk out legalization tables. NFCI.
Relanding after fixing expensive check from modifying tables.

To avoid redundant work, during DAG legalization we keep tables
mapping pre-legalized SDValues to post-legalized SDValues and a
SDValue-to-SDValue map to enable fast node replacements. However, as
the keys are nodes which may be reused it is possible that an entry in
a table refers to a now deleted node N (that should have been renamed
by the value replacement map) while a new node N' exists. If N' is
then replaced that entry would be wrong. Previously we avoided this by
when potentially violating this property, walking every table and
updating all node pointers. This is very expensive but hopefully rare
occurance.

This patch assigns each instance of a SDValue used in legalization a
unique id and uses these ids in the legalization tables. This avoids
any such aliasing issue, avoiding the full table search and allowing
more aggressive incremental table pruning.

In some cases this is a 1000x speedup to compilation.

Reviewers: jyknight, echristo, bogner, tra

Reviewed By: bogner

Subscribers: dberris, grandinj, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47959

llvm-svn: 334880
2018-06-16 02:51:29 +00:00
Michael Berg 8e570c3390 Utilize new SDNode flag functionality to expand current support for fma
Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions.

Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai

Reviewed By: rampitec, nhaehnle

Subscribers: tpr, nemanjai, wdng

Differential Revision: https://reviews.llvm.org/D47918

llvm-svn: 334876
2018-06-16 00:03:06 +00:00