Commit Graph

10118 Commits

Author SHA1 Message Date
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Amaury Sechet 4f729f6a67 [ARM] Add support for SETCCCARRY instead of SETCCE
Summary: As per title. SETCCE is deprecated and will eventually be removed.

Reviewers: rogfer01, efriedma, rengolin, javed.absar

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D46512

llvm-svn: 331929
2018-05-09 22:15:51 +00:00
Shiva Chen 801bf7ebbe [DebugInfo] Examine all uses of isDebugValue() for debug instructions.
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.

This patch has no new test case. I have run regression test and there is
no difference in regression test.

Differential Revision: https://reviews.llvm.org/D45342

Patch by Hsiangkai Wang.

llvm-svn: 331844
2018-05-09 02:42:00 +00:00
Amaury Sechet f91b6a8cf7 [ARM] Select result 1 from ConvertBooleanCarryToCarryFlag's result automatically. NFC
The old behavior return the value 0, which is error prone.

llvm-svn: 331614
2018-05-07 01:43:42 +00:00
Craig Topper 781aa181ab Fix a bunch of places where operator-> was used directly on the return from dyn_cast.
Inspired by r331508, I did a grep and found these.

Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa.

llvm-svn: 331577
2018-05-05 01:57:00 +00:00
Tim Northover 28e0a6f7dd ARM: don't try to over-align large vectors as arguments.
By default LLVM thinks very large vectors get aligned to their size when
passed across functions. Unfortunately no-one told the ARM backend so it
doesn't trigger stack realignment and so accesses can cause the usual
misalignment issues (e.g. a data abort).

This changes the ABI alignment to the stack alignment, which in practice
(and as a bonus) also coincides with the alignment "natural" vectors get.

llvm-svn: 331451
2018-05-03 12:54:25 +00:00
Clement Courbet 6794660828 [TableGen][NFC] Make ResourceCycles definitions more explicit.
https://reviews.llvm.org/D46356

llvm-svn: 331439
2018-05-03 06:08:47 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Nico Weber 432a38838d IWYU for llvm-config.h in llvm, additions.
See r331124 for how I made a list of files missing the include.
I then ran this Python script:

    for f in open('filelist.txt'):
        f = f.strip()
        fl = open(f).readlines()

        found = False
        for i in xrange(len(fl)):
            p = '#include "llvm/'
            if not fl[i].startswith(p):
                continue
            if fl[i][len(p):] > 'Config':
                fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
                found = True
                break
        if not found:
            print 'not found', f
        else:
            open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.

No intended behavior change.

llvm-svn: 331184
2018-04-30 14:59:11 +00:00
Oliver Stannard f3632143da [ARM] Codegen for v8.2A dot product intrinsics
This adds IR intrinsics for the ARM dot-product instructions introduced in
v8.2-A.

Differential revision: https://reviews.llvm.org/D46106

llvm-svn: 331032
2018-04-27 12:50:40 +00:00
David Green c4cccea4c9 [ARM] Enable misched for R52.
Back when the R52 schedule was added in rL286949, there was no way
to enable machine schedules in ARM for specific cores. Since then a
target feature has been added. This enables the feature for R52,
removing the need to manually specify compiler flags.

llvm-svn: 331027
2018-04-27 11:29:49 +00:00
Nico Weber 77c5471d9f List cpp file only once (was added in 147117 and 147117 as build fix each).
llvm-svn: 330587
2018-04-23 13:11:51 +00:00
Nico Weber 5d53aed419 Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt
llvm-svn: 330584
2018-04-23 12:49:34 +00:00
Tim Northover 271d3d2771 MachO: trap unreachable instructions
Debugability is more important than saving 4 bytes to let us to fall
through to nonense.

llvm-svn: 330073
2018-04-13 22:25:20 +00:00
Sjoerd Meijer 834f7dc7ab [ARM] FP16 vmaxnm/vminnm scalar instructions
This adds code generation support for the FP16 vmaxnm/vminnm scalar
instructions.

Differential Revision: https://reviews.llvm.org/D44675

llvm-svn: 330034
2018-04-13 15:34:26 +00:00
Ivan A. Kosarev f533a6e5aa [NEON] Support intrinsic for scalar and vector versions of the VRINTN instruction
Differential Revision: https://reviews.llvm.org/D45514

llvm-svn: 330011
2018-04-13 12:45:12 +00:00
Sjoerd Meijer ac96d7c4b3 [ARM] FP16 VSEL codegen
This is a follow up of rL327695 to instruction select more variants of VSELGT
and VSELGE, for which it is necessary to custom lower SELECT.

More work is required in this area, which will be addressed soon:
- more variants need to be regression tested, but this depends on the next point.
- first LowerConstantFP need to be adjusted for fp16 values.

Differential Revision: https://reviews.llvm.org/D45205

llvm-svn: 329788
2018-04-11 09:28:04 +00:00
Hiroshi Inoue 9ff2380ea6 [NFC] fix trivial typos in comments and error message
"is is" -> "is", "are are" -> "are"

llvm-svn: 329546
2018-04-09 04:37:53 +00:00
Tim Northover e25e458d52 Reapply ARM: Do not spill CSR to stack on entry to noreturn functions
Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.

A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.

Should fix PR9970.

Patch mostly by myeisha (pmb).

llvm-svn: 329494
2018-04-07 10:57:03 +00:00
Vitaly Buka de5f196530 Revert "ARM: Do not spill CSR to stack on entry to noreturn functions"
Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM

This reverts commit r329287

llvm-svn: 329486
2018-04-07 05:36:44 +00:00
Mandeep Singh Grang 9893fe218c [ARM] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: t.p.northover, RKSimon, MatzeB, bkramer

Reviewed By: bkramer

Subscribers: javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D44855

llvm-svn: 329329
2018-04-05 18:31:50 +00:00
Tim Northover b30388bf11 ARM: Do not spill CSR to stack on entry to noreturn functions
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.

Should fix PR9970.

Patch by myeisha (pmb).

llvm-svn: 329287
2018-04-05 14:26:06 +00:00
Nico Weber 1cbd096914 Sort targetgen calls in lib/Target/*/CMakeLists.
Makes it easier to see mistakes such as the one fixed in r329178 and makes
the different target CMakeLists more consistent.

Also remove some stale-looking comments from the Nios2 target cmakefile.

No intended behavior change.

llvm-svn: 329181
2018-04-04 12:37:44 +00:00
Mikhail Maltsev 68f35bcc85 [ARM] Do not convert some vmov instructions
Summary:
Patch https://reviews.llvm.org/D44467 implements conversion of invalid
vmov instructions into valid ones. It turned out that some valid
instructions also get converted, for example

  vmov.i64 d2, #0xff00ff00ff00ff00 ->
  vmov.i16 d2, #0xff00

Such behavior is incorrect because according to the ARM ARM section
F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD
instructions, "On assembly, the data type must be matched in the table
if possible."

This patch fixes the isNEONmovReplicate check so that the above
instruction is not modified any more.

Reviewers: rengolin, olista01

Reviewed By: rengolin

Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D44678

llvm-svn: 329158
2018-04-04 08:54:19 +00:00
David Blaikie bd0c88078a Remove some unneeded #includes to fix layering
llvm-svn: 328838
2018-03-29 22:31:36 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
Christof Douma a1e77c0e02 [ARM] Support float literals under XO
Follow up patch of r328313 to support the UseVMOVSR constraint. Removed
some unneeded instructions from the test and removed some stray
comments.

Differential Revision: https://reviews.llvm.org/D44941

llvm-svn: 328691
2018-03-28 10:02:26 +00:00
Martin Storsjo 439824622a [ARM] Simplify constructing the ARMArchFeature string. NFC.
Differential Revision: https://reviews.llvm.org/D44819

llvm-svn: 328478
2018-03-26 08:41:10 +00:00
Simon Pilgrim 351e4fa0e2 [ARM] Remove sched model instregex entries that don't match any instructions (D44687)
Reviewed by @javed.absar

llvm-svn: 328457
2018-03-25 19:07:17 +00:00
David Blaikie 36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie 13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
David Blaikie 6054e650ff Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
Ana Pazos 41573804f2 [ARM] Fix "Constant pool entry out of range!" in Thumb1 mode
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.

In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode,
adjustBBOffsetsAfter() is not calculating postOffset correctly by
properly accounting for the padding that is required for the constant pool
that immediately follows the jump table branch  instruction.

Reviewers: t.p.northover, eli.friedman

Reviewed By: t.p.northover

Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D44709

llvm-svn: 328341
2018-03-23 17:53:27 +00:00
Christof Douma 4a025cc79d [ARM] Support float literals under XO
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into a integer constant that is moved into a floating point
register.

This patch also restores using fpcmp with immediate 0 under fp-armv8.

Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
2018-03-23 13:02:03 +00:00
Martin Storsjo e1a64fe95c [ARM] Error out on .arm assembler directives on windows
Windows on arm is thumb only.

Differential Revision: https://reviews.llvm.org/D43005

llvm-svn: 328298
2018-03-23 09:10:03 +00:00
Craig Topper 7ccb5ebed8 [ARM] Enable the full InstRW overlap check for ARMScheduleR52.td
This fixes a few issues with the R52 instregexs to enable the full overlap checking

Differential Revision: https://reviews.llvm.org/D44767

llvm-svn: 328216
2018-03-22 17:17:47 +00:00
Nirav Dave 3264c1bdf6 [DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"
Reland ISel cycle checking improvements after simplifying node id
invariant traversal and correcting typo.

llvm-svn: 327898
2018-03-19 20:19:46 +00:00
Martin Storsjo 9a55c1b0dc [ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes
This extends the use of this attribute on ARM and AArch64 from
SVN r325900 (where it was only checked for fixed stack
allocations on ARM/AArch64, but for all stack allocations on X86).

This also adds a testcase for the existing use of disabling the
fixed stack probe with the attribute on ARM and AArch64.

Differential Revision: https://reviews.llvm.org/D44291

llvm-svn: 327897
2018-03-19 20:06:50 +00:00
Sjoerd Meijer d16037d9bb [ARM] Support for v4f16 and v8f16 vectors
This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which
uses v4f16 and v8f16 vector operands and return values. All the moving parts
are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16
intrinsic. In a follow-up patch the rest of the intrinsics and tests will be
added.

Differential Revision: https://reviews.llvm.org/D44538

llvm-svn: 327839
2018-03-19 13:35:25 +00:00
Mikhail Maltsev f07278ec31 [ARM] Fix warnings about missing parentheses in ARMAsmParser
llvm-svn: 327827
2018-03-19 09:48:58 +00:00
Craig Topper e1d6a4df1c [TableGen] When trying to reuse a scheduler class for instructions from an InstRW, make sure we haven't already seen another InstRW containing this instruction on this CPU.
This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check.

So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag.

A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found.

llvm-svn: 327808
2018-03-18 19:56:15 +00:00
Nirav Dave 5f0ab71b62 Revert "[DAG, X86] Revert r327197 "Revert r327170, r327171, r327172""
as it times out building test-suite on PPC.

llvm-svn: 327778
2018-03-17 19:24:54 +00:00
Nirav Dave 982d3a56ea [DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"
Reland ISel cycle checking improvements after simplifying and reducing
node id invariant traversal.

llvm-svn: 327777
2018-03-17 17:42:10 +00:00
Mikhail Maltsev ed1c8bfec2 [ARM] Convert more invalid NEON immediate loads
Summary:
Currently the LLVM MC assembler is able to convert e.g.

  vmov.i32 d0, #0xabababab

(which is technically invalid) into a valid instruction

  vmov.i8 d0, #0xab

this patch adds support for vmov.i64 and for cases with the resulting
load types other than i8, e.g.:

  vmov.i32 d0, #0xab00ab00 ->
  vmov.i16 d0, #0xab00

Reviewers: olista01, rengolin

Reviewed By: rengolin

Subscribers: rengolin, javed.absar, kristof.beyls, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D44467

llvm-svn: 327709
2018-03-16 14:10:56 +00:00
Mikhail Maltsev 8dcf6fa308 [ARM] Fix a check in vmov/vmvn immediate parsing
Summary:
Currently the check is incorrect and the following invalid
instruction is accepted and incorrectly assembled:

  vmov.i32        d2, #0x00a500a6

This patch fixes the issue.

Reviewers: olista01, rengolin

Reviewed By: rengolin

Subscribers: SjoerdMeijer, javed.absar, rogfer01, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D44460

llvm-svn: 327704
2018-03-16 12:46:49 +00:00
Sjoerd Meijer d391a1a985 [ARM] FP16 codegen support for VSEL
This implements lowering of SELECT_CC for f16s, which enables
codegen of VSEL with f16 types.

Differential Revision: https://reviews.llvm.org/D44518

llvm-svn: 327695
2018-03-16 08:06:25 +00:00
Matt Arsenault 41e5ac4fa4 TargetMachine: Add address space to getPointerSize
llvm-svn: 327467
2018-03-14 00:36:23 +00:00
Nirav Dave 042678bd55 Revert: r327172 "Correct load-op-store cycle detection analysis"
r327171 "Improve Dependency analysis when doing multi-node Instruction Selection"
        r328170 "[DAG] Enforce stricter NodeId invariant during Instruction selection"

Reverting patch as NodeId invariant change is causing pathological
increases in compile time on PPC

llvm-svn: 327197
2018-03-10 02:16:15 +00:00
Nirav Dave 071699bf82 [DAG] Enforce stricter NodeId invariant during Instruction selection
Instruction Selection makes use of the topological ordering of nodes
by node id (a node's operands have smaller node id than it) when doing
cycle detection.  During selection we may violate this property as a
selection of multiple nodes may induce a use dependence (and thus a
node id restriction) between two unrelated nodes. If a selected node
has an unselected successor this may allow us to miss a cycle in
detection an invalid selection.

This patch fixes this by marking all unselected successors of a
selected node have negated node id.  We avoid pruning on such negative
ids but still can reconstruct the original id for pruning.

In-tree targets have been updated to replace DAG-level replacements
with ISel-level ones which enforce this property.

This preemptively fixes PR36312 before triggering commit r324359 relands

Reviewers: craig.topper, bogner, jyknight

Subscribers: arsenm, nhaehnle, javed.absar, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D43198

llvm-svn: 327170
2018-03-09 20:57:15 +00:00
Sjoerd Meijer af30f06d5c [ARM] Fix for PR36577
Don't PerformSHLSimplify if the given node is used by a node that also uses a
constant because we may get stuck in an infinite combine loop.

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36577

Patch by Sam Parker.

Differential Revision: https://reviews.llvm.org/D44097

llvm-svn: 326882
2018-03-07 09:10:44 +00:00
Simi Pallipurath 75c6bfeac9 [ARM]Decoding MSR with unpredictable destination register causes an assert
This patch handling:

    Enable parsing of raw encodings of system registers .
    Allows UNPREDICTABLE sysregs to be decoded to a raw number in the same way that disasslib does, rather than llvm crashing.
    Disassemble msr/mrs with unpredictable sysregs as SoftFail.
    Fix regression due to SoftFailing some encodings.

Patch by Chris Ryder

Differential revision:https://reviews.llvm.org/D43374

llvm-svn: 326803
2018-03-06 15:21:19 +00:00
Oliver Stannard f20222a83c [ARM][Asm] VMOVSRR and VMOVRRS need sequential S registers
These instructions require that the two S registers are adjacent (but not the R
registers), because only the first register is included in the encoding, but we
were not checking this in the assembler.

Differential revision: https://reviews.llvm.org/D44084

llvm-svn: 326696
2018-03-05 13:27:26 +00:00
Thomas Preud'homme c699eaa311 Fix location of comment in EmitPopInst
Comment about folding return in LDM was not moved along with the
corresponding code in r242714. This commit fixes that.

llvm-svn: 326690
2018-03-05 11:49:00 +00:00
Benjamin Kramer 4925653555 [ARM] Fold variable into assert.
Avoids unused variable warnings in Release mode.

llvm-svn: 326592
2018-03-02 17:39:20 +00:00
Momchil Velikov 505614bb4f [ARM] Fix access to stack arguments when re-aligning SP in Armv6m
When an Armv6m function dynamically re-aligns the stack, access to incoming
stack arguments (and to stack area, allocated for register varargs) is done via
SP, which is incorrect, as the SP is offset by an unknown amount relative to the
value of SP upon function entry.

This patch fixes it, by making access to "fixed" frame objects be done via FP
when the function needs stack re-alignment.  It also changes the access to
"fixed" frame objects be done via FP (instead of using R6/BP) also for the case
when the stack frame contains variable sized objects. This should allow more
objects to fit within the immediate offset of the load instruction.

All of the above via a small refactoring to reuse the existing
`ARMFrameLowering::ResolveFrameIndexReference.`

Differential Revision: https://reviews.llvm.org/D43566

llvm-svn: 326584
2018-03-02 15:47:14 +00:00
Florian Hahn 9deef20b6c [ARM] Fix codegen for VLD3/VLD4/VST3/VST4 with WB
Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is
broken due to 2 separate bugs:

1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing
   rules to expand them to non pseudo MIR. These are selected for
   ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD.

2) Selection of the right VLD/VST instruction is broken for load and
   store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called
   with MIR opcode for fixed writeback (ie increment is access size)
   and call getVLDSTRegisterUpdateOpcode() to select an opcode with
   register writeback if base register update is of a different size.
   Since getVLDSTRegisterUpdateOpcode() only knows about
   VLD1/VLD2/VST1/VST2 the call is currently conditional on the number
   of element in the vector.

   However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for
   load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not
   updated which later lead to a fixed writeback instruction being
   constructed with an extra operand for the register writeback.

This patch addresses the two issues as follows:
- it adds the necessary mapping from VLD1d64TPseudoWB_register and
  VLD1d64QPseudoWB_register to VLD1d64Twb_register and
  VLD1d64Qwb_register respectively. Like for the existing _fixed
  variants, the cost of these is bumped for unaligned access.
- it changes the logic in SelectVLD and SelectVSD to call isVLDfixed
  and isVSTfixed respectively to decide whether the opcode should be
  updated. It also reworks the logic and comments for pushing the
  writeback offset operand and r0 operand to clarify the logic:
  writeback offset needs to be pushed if it's a register writeback,
  r0 needs to be pushed if not and the instruction is a
  VLD1/VLD2/VST1/VST2.

Reviewers: rengolin, t.p.northover, samparker

Reviewed By: samparker

Patch by Thomas Preud'homme <thomas.preudhomme@arm.com>

Differential Revision: https://reviews.llvm.org/D42970

llvm-svn: 326570
2018-03-02 13:02:55 +00:00
Chih-Hung Hsieh 9f9e4681ac [TLS] use emulated TLS if the target supports only this mode
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.

Differential Revision: https://reviews.llvm.org/D42999

llvm-svn: 326341
2018-02-28 17:48:55 +00:00
Pablo Barrio 512f7ee315 [ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations
Summary:
Expressions of the form x < 0 ? 0 :  x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves

In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.

Patch by Martin Svanfeldt

Reviewers: fhahn, pbarrio, rogfer01

Reviewed By: pbarrio, rogfer01

Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42574

llvm-svn: 326333
2018-02-28 17:13:07 +00:00
Andrew Zhogin f8e88af11d [ARM] Cortex-A57 scheduler fix for ARM backend (missed 16-bit, v8.1/v8.2/v8.3, thumb and pseudo instructions)
Added missed scheduling info for ARM Cortex A57 (AArch32) to have CompleteModel with this checkCompleteness fix: https://reviews.llvm.org/D43235.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D43808

llvm-svn: 326304
2018-02-28 05:53:18 +00:00
Sjoerd Meijer fc0d02cbbf [ARM] Another f16 litpool fix
We were always setting the block alignment to 2 bytes in Thumb mode
and 4-bytes in ARM mode (r325754, and r325012), but this could cause 
reducing the block alignment when it already had been aligned (e.g. 
in Thumb mode when the block is a CPE that was already 4-byte aligned).

Patch by Momchil Velikov, I've only added a test.

Differential Revision: https://reviews.llvm.org/D43777

llvm-svn: 326232
2018-02-27 19:26:02 +00:00
Peter Collingbourne e8436e8631 ARM: Don't rewrite add reg, $sp, 0 -> mov reg, $sp if the add defines CPSR.
Differential Revision: https://reviews.llvm.org/D43807

llvm-svn: 326226
2018-02-27 19:00:59 +00:00
Aditya Nandakumar 599990530e [GISel]: Don't assert when constraining RegisterOperands which are uses.
Currently we assert that only non target specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands but don't have RegisterClass
constraints (say using unknown_class) in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.

https://reviews.llvm.org/D43409

llvm-svn: 326142
2018-02-26 22:56:21 +00:00
Geoff Berry f8bf2ec0a8 [MachineOperand][Target] MachineOperand::isRenamable semantics changes
Summary:
Add a target option AllowRegisterRenaming that is used to opt in to
post-register-allocation renaming of registers.  This is set to 0 by
default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq
fields of all opcodes to be set to 1, causing
MachineOperand::isRenamable to always return false.

Set the AllowRegisterRenaming flag to 1 for all in-tree targets that
have lit tests that were effected by enabling COPY forwarding in
MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC,
RISCV, Sparc, SystemZ and X86).

Add some more comments describing the semantics of the
MachineOperand::isRenamable function and how it is set and maintained.

Change isRenamable to check the operand's opcode
hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of
relying on it being consistently reflected in the IsRenamable bit
setting.

Clear the IsRenamable bit when changing an operand's register value.

Remove target code that was clearing the IsRenamable bit when changing
registers/opcodes now that this is done conservatively by default.

Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in
one place covering all opcodes that have constant pipe read limit
restrictions.

Reviewers: qcolombet, MatzeB

Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D43042

llvm-svn: 325931
2018-02-23 18:25:08 +00:00
Hans Wennborg 89c35fc44d Support for the mno-stack-arg-probe flag
Adds support for this flag. There is also another piece for clang
(separate review). More info:
https://bugs.llvm.org/show_bug.cgi?id=36221

By Ruslan Nikolaev!

Differential Revision: https://reviews.llvm.org/D43107

llvm-svn: 325900
2018-02-23 13:46:25 +00:00
Sjoerd Meijer d31a8c0595 Recommit: [ARM] f16 constant pool fix
This recommits r325754; the modified and failing test case
actually didn't need any modifications.

llvm-svn: 325765
2018-02-22 10:43:57 +00:00
David Green 01e0f25a9f [ARM] Fix issue with large xor constants.
Fixup to rL325573 for large xor constants.

Thanks to Eli Friedman for the catch.

Differential revision: https://reviews.llvm.org/D43549

llvm-svn: 325761
2018-02-22 09:38:57 +00:00
Sjoerd Meijer 9a25247f80 Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy.
llvm-svn: 325756
2018-02-22 08:41:55 +00:00
Sjoerd Meijer 7d5909eb0f [ARM] f16 constant pool fix
This is a follow up of r325012, that allowed half types in constant pools.
Proper alignment was enforced when a big basic block was split up, but not when
a CPE was placed before/after a block; the successor block had the wrong
alignment.

Differential Revision: https://reviews.llvm.org/D43580

llvm-svn: 325754
2018-02-22 08:16:05 +00:00
Hiroshi Inoue 7f9f92f8b6 [NFC] fix trivial typos in comments
"a a" -> "a"

llvm-svn: 325752
2018-02-22 07:48:29 +00:00
Sjoerd Meijer 4d5c40492a [ARM] Lower BR_CC for f16
This case wasn't handled yet.

Differential Revision: https://reviews.llvm.org/D43508

llvm-svn: 325616
2018-02-20 19:28:05 +00:00
David Green 056476497e [ARM] Mark -1 as cheap in xor's for thumb1
We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1
would not otherwise be considered a cheap constant. This prevents the
-1's from being pulled out into constants and potentially hoisted.

Differential Revision: https://reviews.llvm.org/D43451

llvm-svn: 325573
2018-02-20 11:07:35 +00:00
Jonas Paulsson 995ba6e42c [ARM] Return true in enableMultipleCopyHints().
Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: Eli Friedman
llvm-svn: 325327
2018-02-16 09:51:01 +00:00
Roger Ferrer Ibanez d41059a9f6 [ARM] Materialise some boolean values to avoid a branch
This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form

  a != b ? x : 0
  a == b ? 0 : x

and that currently (e.g. in Thumb1) are emitted as branches.

Differential Revision: https://reviews.llvm.org/D34515

llvm-svn: 325323
2018-02-16 09:23:59 +00:00
Pablo Barrio e28cb8399a [ARM] Allow 64- and 128-bit types with 't' inline asm constraint
Summary:
In LLVM, 't' selects a floating-point/SIMD register and only supports
32-bit values. This is appropriately documented in the LLVM Language
Reference Manual. However, this behaviour diverges from that of GCC, where
't' selects the s0-s31 registers and its qX and dX variants depending on
additional operand modifiers (q/P).

For example, the following C code:

#include <arm_neon.h>
float32x4_t a, b, x;
asm("vadd.f32 %0, %1, %2" : "=t" (x) : "t" (a), "t" (b))

results in the following assembly if compiled with GCC:

vadd.f32 s0, s0, s1

whereas LLVM will show "error: couldn't allocate output register for
constraint 't'", since a, b, x are 128-bit variables, not 32-bit.

This patch extends the use of 't' to mean that of GCC, thus allowing
selection of the lower Q vector regs and their D/S variants. For example,
the earlier code will now compile as:

vadd.f32 q0, q0, q1

This behaviour still differs from that of GCC but I think it is actually
more correct, since LLVM picks up the right register type based on the
datatype of x, while GCC would need an extra operand modifier to achieve
the same result, as follows:

asm("vadd.f32 %q0, %q1, %q2" : "=t" (x) : "t" (a), "t" (b))

Since this is only an extension of functionality, existing code should not
be affected by this change. Note that operand modifiers q/P are already
supported by LLVM, so this patch should suffice to support inline
assembly with constraint 't' originally built for GCC.

Reviewers: grosbach, rengolin

Reviewed By: rengolin

Subscribers: rogfer01, efriedma, olista01, aemerson, javed.absar, eraman, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42962

llvm-svn: 325244
2018-02-15 14:44:22 +00:00
Sjoerd Meijer 9430c8cd1c [ARM] f16 vcmp fixes
This adds f16 VCMP match rules and fixes the test cases.

Differential Revision: https://reviews.llvm.org/D43291

llvm-svn: 325228
2018-02-15 10:33:07 +00:00
Sjoerd Meijer 3b4294edd2 [ARM] f16 stack spill/reloads
This adds support for handling f16 stack spills/reloads.

Differential Revision: https://reviews.llvm.org/D43280

llvm-svn: 325130
2018-02-14 15:09:09 +00:00
Sjoerd Meijer f4a7fa7bbe [ARM] Allow half types in ConstantPool
Change ARMConstantIslandPass to:
- accept f16 literals as litpool entries,
- if the litpool needs to be inserted in the middle of a big block, then we
  need to 4-byte align the next instruction in ARM mode.

Differential Revision: https://reviews.llvm.org/D42784

llvm-svn: 325012
2018-02-13 15:34:09 +00:00
Andre Vieira f00234c0bf [ARM] Don't print "Requires NEON" error message for M-profile
Differential Revision: https://reviews.llvm.org/D43125

llvm-svn: 325000
2018-02-13 11:46:38 +00:00
Sjoerd Meijer 101ee43072 [Thumb] Handle addressing mode AddrMode5FP16
This addressing mode wasn't checked, so we were running in an assert.

Differential Revision: https://reviews.llvm.org/D43179

llvm-svn: 324996
2018-02-13 10:29:03 +00:00
Daniel Neilson 7512c3e15f [ARMFastISel] Replace deprecated calls to MemoryIntrinsic::getAlignment() (NFCI)
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes
ARMFastISel to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting
source & dest specific alignments through the new API.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference

http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 324781
2018-02-09 23:31:37 +00:00
Oliver Stannard 133b6085e8 [ARM] Re-commit r324600 with fixed LLVMBuild.txt
ARMDisassembler now depends on the banked register tables in ARMUtils, so the
LLVMBuild.txt needed updating to reflect this.

Original commit mesage:

[ARM] Fix disassembly of invalid banked register moves

When disassembling banked register move instructions, we don't have an
assembly syntax for the unallocated register numbers, so we have to
return Fail rather than SoftFail. Previously we were returning SoftFail,
then crashing in the InstPrinter as we have no way to represent these
encodings in an assembly string.

This also switches the decoder to use the table-generated list of banked
registers, removing the duplicated list of encodings.

Differential revision: https://reviews.llvm.org/D43066

llvm-svn: 324606
2018-02-08 14:31:22 +00:00
Oliver Stannard 3c11ecbbab Revert r324600 as it breaks a buildbot
The broken bot (clang-ppc64le-linux-multistage) is doign a shared-object build,
so I guess using lookupBankedRegByEncoding in the disassembler is a layering
violation?

llvm-svn: 324604
2018-02-08 14:21:28 +00:00
Oliver Stannard db982b25ff [ARM] Fix disassembly of invalid banked register moves
When disassembling banked register move instructions, we don't have an
assembly syntax for the unallocated register numbers, so we have to
return Fail rather than SoftFail. Previously we were returning SoftFail,
then crashing in the InstPrinter as we have no way to represent these
encodings in an assembly string.

This also switches the decoder to use the table-generated list of banked
registers, removing the duplicated list of encodings.

Differential revision: https://reviews.llvm.org/D43066

llvm-svn: 324600
2018-02-08 13:06:08 +00:00
Peter Collingbourne 559ff1fe03 ARM: Remove dead code. NFCI.
llvm-svn: 324565
2018-02-08 05:28:39 +00:00
Sjoerd Meijer 8c0739347c [ARM] FP16 mov imm pattern
This is a follow up of r324321, adding a match pattern for mov with a FP16
immediate (also fixing operand vfp_f16imm that wasn't even compiling).

Differential Revision: https://reviews.llvm.org/D42973

llvm-svn: 324456
2018-02-07 08:37:17 +00:00
Sjoerd Meijer d2718ba95e [ARM] f16 conversions
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion
match patterns.

Differential Revision: https://reviews.llvm.org/D42954

llvm-svn: 324360
2018-02-06 16:28:43 +00:00
Oliver Stannard ee0ac39305 [ARM][AArch64] Add CSDB speculation barrier instruction
This adds the CSDB instruction, which is a new barrier instruction
described by the whitepaper at [1].

This is in encoding space which was previously executed as a NOP, so it is
available for all targets that have the relevant NOP encoding space. This
matches the binutils behaviour for these instructions [2][3].

[1] https://developer.arm.com/support/security-update
[2] https://sourceware.org/ml/binutils/2018-01/msg00116.html
[3] https://sourceware.org/ml/binutils/2018-01/msg00120.html

llvm-svn: 324324
2018-02-06 09:24:47 +00:00
Sjoerd Meijer 89ea2648bb [ARM] Armv8.2-A FP16 code generation (part 3/3)
This adds most of the FP16 codegen support, but these areas need further work:

- FP16 literals and immediates are not properly supported yet (e.g. literal
  pool needs work),
- Instructions that are generated from intrinsics (e.g. vabs) haven't been
  added.

This will be addressed in follow-up patches.

Differential Revision: https://reviews.llvm.org/D42849

llvm-svn: 324321
2018-02-06 08:43:56 +00:00
Sjoerd Meijer 9d9a86535e [ARM] FullFP16 LowerReturn Fix
Commit r323512 introduced an optimisation in LowerReturn for half-precision
return values. A missing check caused a crash when the return value is "undef"
(i.e. a node that has no operands).

Differential Revision: https://reviews.llvm.org/D42743

llvm-svn: 323968
2018-02-01 13:48:40 +00:00
Yvan Roux 490e9e6761 [ARM] Add support for unpredictable MVN instructions.
This fixes bugzilla 33011
https://bugs.llvm.org/show_bug.cgi?id=33011

Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in
sections A8.8.116 and A8.8.117.

It fixes also the usage of PC register as destination register for MVN
register-shifted register version as specified in A8.8.117.

Differential Revision: https://reviews.llvm.org/D41905

llvm-svn: 323954
2018-02-01 12:06:57 +00:00
Yvan Roux 705e26a243 Test commit: Fix a comment.
llvm-svn: 323947
2018-02-01 08:39:58 +00:00
Evgeniy Stepanov 7746899f48 Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations"
Miscompiles code. Testcase pending.

This reverts commit r323869.

llvm-svn: 323929
2018-01-31 22:55:19 +00:00
Diana Picus 12ed95e3e7 Fix formatting for r323876. NFC
llvm-svn: 323878
2018-01-31 15:16:17 +00:00
Diana Picus 1d4421f6a6 [ARM GlobalISel] Modernize LegalizerInfo. NFCI
Start using the new LegalizerInfo API introduced in r323681.

Keep the old API for opcodes that need Lowering in some circumstances
(G_FNEG and G_UREM/G_SREM).

llvm-svn: 323876
2018-01-31 14:55:07 +00:00
Pablo Barrio 2e442a7831 [ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations
Summary:
Expressions of the form x < 0 ? 0 :  x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves

In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.

Patch by Marten Svanfeldt.

Reviewers: fhahn, pbarrio

Reviewed By: pbarrio

Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42574

llvm-svn: 323869
2018-01-31 13:20:10 +00:00
Sjoerd Meijer 98d5359ea2 [ARM] Armv8.2-A FP16 code generation (part 2/3)
Half-precision arguments and return values are passed as if it were an int or
float for ARM. This results in truncates and bitcasts to/from i16 and f16
values, which are legalized very early to stack stores/loads. When FullFP16 is
enabled, we want to avoid codegen for these bitcasts as it is unnecessary and
inefficient.

Differential Revision: https://reviews.llvm.org/D42580

llvm-svn: 323861
2018-01-31 10:18:29 +00:00
Roger Ferrer Ibanez aea4208720 [ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR.
In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do
copies CPSR ↔ GPR but not all Thumb1 targets implement them.

The schedule can attempt, before attempting a copy, to clone the instructions
but it does not currently do that for nodes with input glue. In this patch we
introduce a target-hook to let the hook decide if a glued machinenode is still
eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS .

As a follow-up of this change we should actually implement the copies for the
Thumb1 targets that do implement them and restrict the hook to the targets that
can't really do such copy as these clones are not ideal.

This change fixes PR35836.

Differential Revision: https://reviews.llvm.org/D42051

llvm-svn: 323857
2018-01-31 09:23:43 +00:00
Diana Picus 2a5b962030 [ARM GlobalISel] Map G_SITOFP and G_UITOFP
Straightforward mapping (integer operand to GPR, floating point operand
to FPR).

llvm-svn: 323731
2018-01-30 09:15:23 +00:00
Diana Picus 517531e5a5 [ARM GlobalISel] Legalize G_SITOFP and G_UITOFP
Legal if we have hardware support, libcall otherwise.

Also add supporting code to the legalizer helper for libcalls.

llvm-svn: 323730
2018-01-30 09:15:17 +00:00
Diana Picus a2da03022c [ARM GlobalISel] Map G_FPTOSI and G_FPTOUI
Straightforward mapping (integer operand goes to GPR, floating point
operand goes to FPR).

llvm-svn: 323727
2018-01-30 07:54:58 +00:00
Diana Picus 4ed0ee7b5f [ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI
Legal if we have hardware support for floating point, libcalls
otherwise.

Also add the necessary support for libcalls in the legalizer helper.

llvm-svn: 323726
2018-01-30 07:54:52 +00:00
Daniel Sanders 08464524c3 [ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern
Summary:
Apparently, we missed on constraining register classes of VReg-operands of all the instructions
built from a destination pattern but the root (top-level) one. The issue exposed itself
while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped
into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained,
while nested VTOSIZS (or rather its destination virtual register to be exact) does not.

Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI.

https://bugs.llvm.org/show_bug.cgi?id=35965
rdar://problem/36886530

Patch by Roman Tereshin

Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan

Reviewed By: dsanders, qcolombet, rovka

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42565

llvm-svn: 323692
2018-01-29 21:09:12 +00:00
Daniel Sanders 9ade5592d9 [globalisel] Make LegalizerInfo::LegalizeAction available outside of LegalizerInfo. NFC
Summary:
The improvements to the LegalizerInfo discussed in D42244 require that
LegalizerInfo::LegalizeAction be available for use in other classes. As such,
it needs to be moved out of LegalizerInfo. This has been done separately to the
next patch to minimize the noise in that patch.

llvm-svn: 323669
2018-01-29 17:37:29 +00:00
Sjoerd Meijer 3ddb7fb663 [ARM] FP16Pat and FullFP16Pat patterns. NFC.
Create and use FP16Pat FullFP16Pat helper patterns to make the difference
explicit.

Differential Revision: https://reviews.llvm.org/D42634

llvm-svn: 323640
2018-01-29 11:28:06 +00:00
Momchil Velikov d2cc6fd90b [ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative
load instruction

The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR`
register class. The function serves to emit a `tLDRspi` instruction and
certainly any subset of the `tGPR` register class is a valid destination of the
load.

Differential revision: https://reviews.llvm.org/D42535

llvm-svn: 323514
2018-01-26 10:20:58 +00:00
Sjoerd Meijer 011de9c0ca [ARM] Armv8.2-A FP16 code generation (part 1/3)
This is the groundwork for Armv8.2-A FP16 code generation .

Clang passes and returns _Float16 values as floats, together with the required
bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318.
We will implement half-precision argument passing/returning lowering in the ARM
backend soon, but for now this means that this:

_Float16 sub(_Float16 a, _Float16 b) {
  return a + b;
}

gets lowered to this:

define float @sub(float %a.coerce, float %b.coerce) {
entry:
  %0 = bitcast float %a.coerce to i32
  %tmp.0.extract.trunc = trunc i32 %0 to i16
  %1 = bitcast i16 %tmp.0.extract.trunc to half
  <SNIP>
  %add = fadd half %1, %3
  <SNIP>
}

When FullFP16 is *not* supported, we don't make f16 a legal type, and we get
legalization for "free", i.e. nothing changes and everything works as before.
And also f16 argument passing/returning is handled.

When FullFP16 is supported, we do make f16 a legal type, and have 2 places that
we need to patch up: f16 argument passing and returning, which involves minor
tweaks to avoid unnecessary code generation for some bitcasts.

As a "demonstrator" that this works for the different FP16, FullFP16, softfp
modes, etc., I've added match rules to the VSUB instruction description showing
that we can codegen this instruction from IR, but more importantly, also to
some conversion instructions. These conversions were causing issue before in
the FP16 and FullFP16 cases.

I've also added match rules to the VLDRH and VSTRH desriptions, so that we can
actually compile the entire half-precision sub code example above. This showed
that these loads and stores had the wrong addressing mode specified: AddrMode5
instead of AddrMode5FP16, which turned out not be implemented at all, so that
has also been added.

This is the minimal patch that shows all the different moving parts. In patch
2/3 I will add some efficient lowering of bitcasts, and in 2/3 I will add the
remaining Armv8.2-A FP16 instruction descriptions.


Thanks to Sam Parker and Oliver Stannard for their help and reviews!


Differential Revision: https://reviews.llvm.org/D38315

llvm-svn: 323512
2018-01-26 09:26:40 +00:00
Weiming Zhao 665784f170 [ARM] Expand long shifts for Thumb1 to __aeabi_ calls
Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1.

Reviewers: samparker

Reviewed By: samparker

Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42401

llvm-svn: 323354
2018-01-24 18:00:57 +00:00
Martin Storsjo 4ed94a06ac [ARM] Call __chkstk for dynamic stack allocation in all windows environments
This matches what MSVC does for alloca() function calls on ARM.
Even if MSVC doesn't support VLAs at the language level, it does
support the alloca function.

On the clang level, both the _alloca() (when emulating MSVC, which is
what the alloca() function expands to) and __builtin_alloca() builtin
functions, and VLAs, map to the same LLVM IR "alloca" function - so
within LLVM they're not distinguishable from each other.

Differential Revision: https://reviews.llvm.org/D42292

llvm-svn: 323308
2018-01-24 06:40:11 +00:00
Joel Galenson 1d89cd2bb4 [ARM] Cleanup part of ARMBaseInstrInfo::optimizeCompareInstr (NFCI).
As noted in another review, this loop is confusing.  This commit cleans it up
somewhat.

Differential Revision: https://reviews.llvm.org/D42312

llvm-svn: 323136
2018-01-22 17:53:47 +00:00
Marina Yatsina 0bf841ac2a Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files.
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40333

Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56
llvm-svn: 323095
2018-01-22 10:06:50 +00:00
Marina Yatsina 3d8efa4f0c Rename ExecutionDepsFix files to ExecutionDomainFix
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40332

Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8
llvm-svn: 323094
2018-01-22 10:06:33 +00:00
Marina Yatsina 6fc2aaae8d Separate ExecutionDepsFix into 4 parts:
1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains).
2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings.
3. BreakFalseDeps - Breaks false dependencies.
4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks.

This also included the following changes to ExcecutionDepsFix original logic:
1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class.
2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class.

Additional changes in affected files:
1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate.
2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate.

Additional refactoring changes will follow.

This commit is (almost) NFC.
The only functional change is that now BreakFalseDeps will break dependency for all register classes.
Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet.
In a future commit several instructions (and tests) will be added.

This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40330

Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275
llvm-svn: 323087
2018-01-22 10:05:23 +00:00
Joel Galenson dbc724f764 [ARM] Fix perf regression in compare optimization.
Fix a performance regression caused by r322737.

While trying to make it easier to replace compares with existing adds and
subtracts, I accidentally stopped it from doing so in some cases.  This should
fix that.  I'm also fixing another potential bug in that commit.

Differential Revision: https://reviews.llvm.org/D42263

llvm-svn: 322972
2018-01-19 17:46:27 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Reid Kleckner 1aa9061c5f [CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64
Every known PE COFF target emits /EXPORT: linker flags into a .drective
section. The AsmPrinter should handle this.

While we're at it, use global_values() and emit each export flag with
its own .ascii directive. This should make the .s file output more
readable.

llvm-svn: 322788
2018-01-17 23:55:23 +00:00
Joel Galenson bbcaf4ac5c [ARM] Optimize {s,u}mul.with.overflow.
This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG.

Differential revision: https://reviews.llvm.org/D40922

llvm-svn: 322738
2018-01-17 19:19:05 +00:00
Joel Galenson fe7fa40869 [ARM] Optimize {s,u}{add,sub}.with.overflow.
The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with a "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other).

Differential revision: https://reviews.llvm.org/D38378

llvm-svn: 322737
2018-01-17 19:19:05 +00:00
Diana Picus 01bcfd2112 [ARM GlobalISel] Rename local variable. NFC
llvm-svn: 322667
2018-01-17 15:25:37 +00:00
Diana Picus c62a16234b [ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR
llvm-svn: 322657
2018-01-17 14:14:14 +00:00
Diana Picus 65ed364fac [ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC
Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware
support, but only for conversions between float and double.

Also add the necessary boilerplate so that the LegalizerHelper can
introduce the required libcalls. This also works only for float and
double, but isn't too difficult to extend when the need arises.

llvm-svn: 322651
2018-01-17 13:34:10 +00:00
Diana Picus 2dc5405693 [ARM GlobalISel] Map G_FMA to FPR
llvm-svn: 322367
2018-01-12 12:06:01 +00:00
Diana Picus e74243d473 [ARM GlobalISel] Legalize G_FMA
For hard float with VFP4, it is legal. Otherwise, we use libcalls.

This needs a bit of support in the LegalizerHelper for soft float
because we didn't handle G_FMA libcalls yet. The support is trivial, as
the only difference between G_FMA and other libcalls that we already
handle is that it has 3 input operands rather than just 2.

llvm-svn: 322366
2018-01-12 11:30:45 +00:00
Andre Vieira 5627c218e1 [ARM] Add codegen for SMMULR, SMMLAR and SMMLSR
This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR
instructions from equivalent IR patterns.

Differential Revision: https://reviews.llvm.org/D41775

llvm-svn: 322361
2018-01-12 09:24:41 +00:00
Andre Vieira 26b9de9ebb [ARM] Fix erroneous availability of SMMLS for Armv7-M
Differential Revision: https://reviews.llvm.org/D41855

llvm-svn: 322360
2018-01-12 09:21:09 +00:00
Matthias Braun ea4359e922 PeepholeOptimizer: Fix for vregs without defs
The PeepholeOptimizer would fail for vregs without a definition. If this
was caused by an undef operand abort to keep the code simple (so we
don't need to add logic everywhere to replicate the undef flag).

Differential Revision: https://reviews.llvm.org/D40763

llvm-svn: 322319
2018-01-11 22:30:43 +00:00
Evgeniy Stepanov 5223b5d9d6 [arm] Implement Target Operand Flag MIR serialization.
Reviewers: efriedma, pcc

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D39975

llvm-svn: 322312
2018-01-11 21:37:58 +00:00
Diana Picus 0ed7513c83 [ARM GlobalISel] Map G_FNEG to the FPR bank
llvm-svn: 322169
2018-01-10 11:13:31 +00:00
Diana Picus f949a0abac [ARM GlobalISel] Legalize G_FNEG for s32 and s64
For hard float, it is legal.

For soft float, we need to lower to 0 - x first, and then we can use the
libcall for G_FSUB. This is undoing some of the canonicalization
performed by the IRTranslator (which introduces G_FNEG when it sees a
0 - x). Ideally, that canonicalization would be performed by a
pre-legalizer pass that would allow targets to opt out of this behaviour
rather than dance around it in the legalizer.

llvm-svn: 322168
2018-01-10 10:45:34 +00:00
Diana Picus 8f14886630 [ARM GlobalISel] Legalize s32/s64 G_FCONSTANT
Legal for hard float.
Change to G_CONSTANT for soft float (but preserve the binary
representation).

llvm-svn: 322164
2018-01-10 10:01:49 +00:00
Diana Picus 734a5e8912 [ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits
Make G_CONSTANT narrow for any scalars larger than 32 bits.

llvm-svn: 322162
2018-01-10 09:32:01 +00:00
Francis Visoiu Mistrih 7d9bef8f5c [CodeGen] Don't print "pred:" and "opt:" in -debug output
In -debug output we print "pred:" whenever a MachineOperand is a
predicate operand in the instruction descriptor, and "opt:" whenever a
MachineOperand is an optional def in the instruction descriptor.

Differential Revision: https://reviews.llvm.org/D41870

llvm-svn: 322096
2018-01-09 17:31:07 +00:00
Momchil Velikov ac7c5c1d92 [ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz
The patch makes the unwind information not mention registers, which were pushed
solely for the purpose of saving stack adjustment instructions.

Differential revision: https://reviews.llvm.org/D41300
Fixes https://bugs.llvm.org/show_bug.cgi?id=35379

llvm-svn: 321996
2018-01-08 14:47:19 +00:00
Momchil Velikov d17dabca31 [ARM] Fix PR35481
This patch allows `r7` to be used, regardless of its use as a frame pointer, as
a temporary register when popping `lr`, and also falls back to using a high
temporary register if, for some reason, we weren't able to find a suitable low
one.

Differential revision: https://reviews.llvm.org/D40961
Fixes https://bugs.llvm.org/show_bug.cgi?id=35481

llvm-svn: 321989
2018-01-08 11:32:37 +00:00
Reid Kleckner 5619669a5a Fix -Wsign-compare warnings on Windows
These arise because enums are 'int' by default.

llvm-svn: 321887
2018-01-05 19:53:51 +00:00
Momchil Velikov 7efdd090e2 [ARM] Issue an erorr when non-general-purpose registers are used in address operands
Currently the assembler would accept, e.g. `ldr r0, [s0, #12]` and similar.
This patch add checks that only general-purpose registers are used in address
operands, shifted registers, and shift amounts.

Differential revision: https://reviews.llvm.org/D39910

llvm-svn: 321866
2018-01-05 13:28:10 +00:00
Oliver Stannard 7d9198b296 [ARM] Fix endianness of Thumb .inst.w directive
Wide Thumb2 instructions should be emitted into the object file as pairs of
16-bit words of the appropriate endianness, not one 32-bit word.

Differential revision: https://reviews.llvm.org/D41185

llvm-svn: 321799
2018-01-04 13:56:40 +00:00
Diana Picus 865f7fecb2 [ARM GlobalISel] Select G_PHI
Select G_PHI to PHI and manually constrain the result register. This is
very similar to how COPY is handled, so extract and reuse some of that
code.

llvm-svn: 321797
2018-01-04 13:09:25 +00:00
Diana Picus c768bbe2e7 [ARM GlobalISel] Legalize scalar G_PHI
Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard
float. Widen any smaller types.

llvm-svn: 321795
2018-01-04 13:09:14 +00:00
Diana Picus 37ae9f68a4 [ARM GlobalISel] Fix selection of pointer constants
We used to handle G_CONSTANT with pointer type by forcing the type of
the result register to s32 and then letting TableGen handle it.
Unfortunately, setting the type only works for generic virtual
registers, that haven't yet been constrained to a register class (e.g.
those used only by a COPY later on). If the result register has already
been constrained as a use of a previously selected instruction, then
setting the type will assert.

It would be nice to be able to teach TableGen to select pointer
constants the same as integer constants, but since it's such an edge
case (at the moment the only pointer constant that we're generally
interested in is 0, and that is mostly used for comparisons and selects,
which are also not supported by TableGen) it's probably not worth the
effort right now. Instead, handle pointer constants with some trivial
handwritten code.

llvm-svn: 321793
2018-01-04 10:54:57 +00:00
Alex Bradbury 46db78b743 [ARM][NFC] Avoid recreating MCSubtargetInfo in ARMAsmBackend
After D41349, we can now directly access MCSubtargetInfo from 
createARM*AsmBackend. This patch makes use of this, avoiding the need to 
create a fresh MCSubtargetInfo (which was previously always done with a blank 
CPU and feature string). Given the total size of the change remains pretty 
tiny and we're removing the old explicit destructor, I changed the STI field 
to a reference rather than a pointer.

Differential Revision: https://reviews.llvm.org/D41693

llvm-svn: 321707
2018-01-03 13:46:21 +00:00
Alex Bradbury b22f751fa7 Thread MCSubtargetInfo through Target::createMCAsmBackend
Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. 
D20830 threaded an MCSubtargetInfo reference through 
MCAsmBackend::relaxInstruction, but this isn't the only function that would 
benefit from access. This patch removes the Triple and CPUString arguments 
from createMCAsmBackend and replaces them with MCSubtargetInfo.

This patch just changes the interface without making any intentional 
functional changes. Once in, several cleanups are possible:
* Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend
* Support 16-bit instructions when valid in MipsAsmBackend::writeNopData
* Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl
* Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221)

This change initially exposed PR35686, which has since been resolved in r321026.

Differential Revision: https://reviews.llvm.org/D41349

llvm-svn: 321692
2018-01-03 08:53:05 +00:00
Sanjoy Das 26d11ca4b0 (Re-landing) Expose a TargetMachine::getTargetTransformInfo function
Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321375
2017-12-22 18:21:59 +00:00
Diana Picus 28a6d0e639 [ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32
Mark conversions between pointers and 32-bit scalars as legal, map them
to the GPR and select to a simple COPY.

llvm-svn: 321356
2017-12-22 13:05:51 +00:00
Diana Picus 68773859c8 [ARM GlobalISel] Support pointer constants
Pointer constants are pretty rare, since we usually represent them as
integer constants and then cast to pointer. One notable exception is the
null pointer constant, which is represented directly as a G_CONSTANT 0
with pointer type. Mark it as legal and make sure it is selected like
any other integer constant.

llvm-svn: 321354
2017-12-22 11:09:18 +00:00
Eli Friedman 39ed9a602b [Inliner] Restrict soft-float inlining penalty.
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.

While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.

Differential Revision: https://reviews.llvm.org/D41522

llvm-svn: 321332
2017-12-22 02:08:08 +00:00
Sam Parker 98727bc261 [ARM] Armv8-R DFB instruction
Implement MC support for the Armv8-R 'Data Full Barrier' instruction.

Differential Revision: https://reviews.llvm.org/D41430

llvm-svn: 321256
2017-12-21 11:17:49 +00:00
Sanjoy Das 747d1114d6 Revert "Expose a TargetMachine::getTargetTransformInfo function"
This reverts commit r321234.  It breaks the -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 321243
2017-12-21 02:34:39 +00:00
Sanjoy Das 0c3de350b4 Expose a TargetMachine::getTargetTransformInfo function
Summary:
This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321234
2017-12-21 01:06:58 +00:00
Joel Galenson 6f4e827e4c [ARM] Optimize {s,u}{add,sub}.with.overflow.
The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG.  This commit ports that code to the ARM backend.

Differential revision: https://reviews.llvm.org/D35635

llvm-svn: 321224
2017-12-20 22:25:39 +00:00
Diana Picus 75ce852abe [ARM GlobalISel] Fix assertion in RegBankSelect
We get an assertion in RegBankSelect for code along the lines of
my_32_bit_int = my_64_bit_int, which tends to translate into a 64-bit
load, followed by a G_TRUNC, followed by a 32-bit store. This appears in
a couple of places in the test-suite.

At the moment, the legalizer doesn't distinguish between integer and
floating point scalars, so a 64-bit load will be marked as legal for
targets with VFP, and so will the rest of the sequence, leading to a
slightly bizarre G_TRUNC reaching RegBankSelect.

Since the current support for 64-bit integers is rather immature, this
patch works around the issue by explicitly handling this case in
RegBankSelect and InstructionSelect. In the future, we may want to
revisit this decision and make sure 64-bit integer loads are narrowed
before reaching RegBankSelect.

llvm-svn: 321165
2017-12-20 11:27:10 +00:00
Florian Hahn c3aa6d83fd [ARM] Lower unsigned saturation to USAT
Summary:
Implement lower of unsigned saturation on an interval [0, k] where k + 1 is a power of two using USAT instruction in a similar way to how [~k, k] is lowered using SSAT on ARM models that supports it.

Patch by Marten Svanfeldt

Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn

Reviewed By: fhahn

Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D41348

llvm-svn: 321164
2017-12-20 11:13:57 +00:00
Adrian Prantl 0e6694d111 Silence a bunch of implicit fallthrough warnings
llvm-svn: 321114
2017-12-19 22:05:25 +00:00
David Green 110844d21c [ARM] Register the Thumb2SizeReducePass. NFC
Also adds a simple test case.

llvm-svn: 321072
2017-12-19 12:19:08 +00:00
Matthias Braun a4852d2c19 X86/AArch64/ARM: Factor out common sincos_stret logic; NFCI
Note:
- X86ISelLowering: setLibcallName(SINCOS) was superfluous as
  InitLibcalls() already does it.
- ARMISelLowering: Setting libcallnames for sincos/sincosf seemed
  superfluous as in the darwin case it wouldn't be used while for all
  other cases InitLibcalls already does it.

llvm-svn: 321036
2017-12-18 23:19:42 +00:00
Diana Picus 8ee540c01a [ARM GlobalISel] Fix G_(UN)MERGE_VALUES handling after r319524
r319524 has made more G_MERGE_VALUES/G_UNMERGE_VALUES pairs legal than
are supported by the rest of the pipeline. Restrict that to only the
cases that we can currently handle: packing 32-bit values into 64-bit
ones, when we have hardware FP.

llvm-svn: 320980
2017-12-18 13:22:28 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Sanjay Patel 0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Geoff Berry dcc646e40b [ARM] Fix isRenamable flag setting on expanded VSTMDIA opcode.
Fixes expensive-check ARM buildbot failure.

llvm-svn: 320718
2017-12-14 18:06:25 +00:00
Francis Visoiu Mistrih 5df3bbf3e6 [CodeGen] Print global addresses as @foo in both MIR and debug output
Work towards the unification of MIR and debug output by printing
`@foo` instead of `<ga:@foo>`.

Also print target flags in the MIR format since most of them are used on
global address operands.

Only debug syntax is affected.

llvm-svn: 320682
2017-12-14 10:03:09 +00:00
Michael Zolotukhin 67b04bd8ac Recover some overzealously removed includes.
llvm-svn: 320648
2017-12-13 22:21:02 +00:00
Michael Zolotukhin caf9ea6aa0 Remove redundant includes from lib/Target/ARM.
llvm-svn: 320635
2017-12-13 21:31:17 +00:00
Geoff Berry 60c431022e [MachineOperand][MIR] Add isRenamable to MachineOperand.
Summary:
Add isRenamable() predicate to MachineOperand.  This predicate can be
used by machine passes after register allocation to determine whether it
is safe to rename a given register operand.  Register operands that
aren't marked as renamable may be required to be assigned their current
register to satisfy constraints that are not captured by the machine
IR (e.g. ABI or ISA constraints).

Reviewers: qcolombet, MatzeB, hfinkel

Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39400

llvm-svn: 320503
2017-12-12 17:53:59 +00:00
Roger Ferrer Ibanez 5ea0f2501f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564
 - fixes PR35103

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 320355
2017-12-11 12:13:45 +00:00
Francis Visoiu Mistrih a8a83d150f [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022
2017-12-07 10:40:31 +00:00
Nirav Dave 7d8f3e0c93 [ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.

 * Properly truncate store merge constants
 * Disable merging of truncated stores floating points
 * Ensure merges of constant stores into a single vector are
   constructed from legal elements.

Reviewers: eastig, efriedma

Reviewed By: eastig

Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319899
2017-12-06 15:30:13 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Pablo Barrio 2b4385846c Fix function pointer tail calls in armv8-M.base
Summary:
The compiler fails with the following error message:

fatal error: error in backend: ran out of registers during
register allocation

Tail call optimization for Armv8-M.base fails to meet all the required
constraints when handling calls to function pointers where the
arguments take up r0-r3. This is because the pointer to the
function to be called can only be stored in r0-r3, but these are
all occupied by arguments. This patch makes sure that tail call
optimization does not try to handle this type of calls.

Reviewers: chill, MatzeB, olista01, rengolin, efriedma

Reviewed By: olista01, efriedma

Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40706

llvm-svn: 319664
2017-12-04 16:55:49 +00:00
Oliver Stannard 7ab60605f8 Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands
This is causing a failure in the llvm-clang-x86_64-expensive-checks-win
buildbot, and I can't reproduce it locally, so reverting until I can work out
what is wrong.

llvm-svn: 319654
2017-12-04 13:42:22 +00:00
Oliver Stannard 7cd4db94f8 [Asm, ARM] Add fallback diag for multiple invalid operands
This adds a "invalid operands for instruction" diagnostic for
instructions where there is an instruction encoding with the correct
mnemonic and which is available for this target, but where multiple
operands do not match those which were provided. This makes it clear
that there is some combination of operands that is valid for the current
target, which the default diagnostic of "invalid instruction" does not.

Since this is a very general error, we only emit it if we don't have a
more specific error.

Differential revision: https://reviews.llvm.org/D36747

llvm-svn: 319649
2017-12-04 12:02:32 +00:00
Martin Storsjo c85cc41801 [ARM] Allow using emulated tls on platforms other than ELF
This matches how it is done on X86.

This allows using emulated tls on windows; in MinGW environments,
native tls isn't supported at the moment.

Differential Revision: https://reviews.llvm.org/D40769

llvm-svn: 319643
2017-12-04 09:08:55 +00:00
Nirav Dave 3e76e1e89e [DAG][ARM] Revert "Reenable post-legalize store merge"
due to failures in AArch and ARM code gen.

llvm-svn: 319587
2017-12-01 21:55:47 +00:00
Nirav Dave eb2b24fded [ARM][DAG] Reenable post-legalize store merge
Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case.

Reviewers: eastig, efriedma

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319547
2017-12-01 14:49:26 +00:00
Volkan Keles a32ff00b00 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Diana Picus f003d9ff95 [ARM GlobalISel] Bail out for byval
Fallback if we have a byval parameter or argument since we don't support
them yet.

llvm-svn: 319428
2017-11-30 12:23:44 +00:00
Francis Visoiu Mistrih 93ef145862 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

llvm-svn: 319427
2017-11-30 12:12:19 +00:00
Nirav Dave bafaa53c4d [ARM][DAG] Revert Disable post-legalization store merge for ARM
Partially reverting enabling of post-legalization store merge
(r319036) for just ARM backend as it is causing incorrect code
in some Thumb2 cases.

llvm-svn: 319331
2017-11-29 18:06:13 +00:00
Diana Picus 863b5b05f1 [ARM GlobalISel] Fix selecting G_BRCOND
When lowering a G_BRCOND, we generate a TSTri of the condition against
1, which sets the flags, and then a Bcc which branches based on the
value of the flags.

Unfortunately, we were using the wrong condition code to check whether
we need to branch (EQ instead of NE), which caused all our branches to
do the opposite of what they were intended to do. This patch fixes the
issue by using the correct condition code.

llvm-svn: 319313
2017-11-29 14:20:06 +00:00
Oliver Stannard 9ea2eaeb50 [ARM] Add support for armv7e-m to the .arch directive
This will allow compilation of assembly files targeting armv7e-m without having
to specify the Tag_CPU_arch attribute as a workaround.

Differential revision: https://reviews.llvm.org/D40370

Patch by Ian Tessier!

llvm-svn: 319303
2017-11-29 10:12:15 +00:00
Francis Visoiu Mistrih 9d7bb0cb40 [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187
2017-11-28 17:15:09 +00:00
Francis Visoiu Mistrih 9d419d3b0c [CodeGen] Rename functions PrintReg* to printReg*
LLVM Coding Standards:
  Function names should be verb phrases (as they represent actions), and
  command-like function should be imperative. The name should be camel
  case, and start with a lower case letter (e.g. openFile() or isFoo()).

Differential Revision: https://reviews.llvm.org/D40416

llvm-svn: 319168
2017-11-28 12:42:37 +00:00
Matthias Braun 5d01e708e1 ARM: Fix PR32578
https://llvm.org/PR32578

I simplified and converted the reproducer into a lit test.

Patch by Vedant Kumar!

llvm-svn: 319130
2017-11-28 01:17:52 +00:00
Momchil Velikov bd2c7eb923 [ARM] Fix an off-by-one error when restoring LR for 16-bit Thumb
The commit https://reviews.llvm.org/rL318143 computes incorrectly to offset to
restore LR from.

The number of tPOP operands is 2 (condition) + 2 (implicit def and use of SP) +
count of the popped registers. We need to load LR from just past the last
register, hence the correct offset should be either getNumOperands() - 4 and
getNumExplicitOperands() - 2 (multiplied by 4).

Differential revision: https://reviews.llvm.org/D40305

llvm-svn: 319014
2017-11-27 10:13:14 +00:00
Diana Picus c01f7f131b [ARM GlobalISel] Support G_FDIV for s32 and s64
TableGen already generates code for selecting a G_FDIV, so we only need
to add a test.

For the legalizer and reg bank select, we do the same thing as for the
other floating point binary operations: either mark as legal if we have
a FP unit or lower to a libcall, and map to the floating point
registers.

llvm-svn: 318915
2017-11-23 13:26:07 +00:00
Diana Picus 9faa09b21e [ARM GlobalISel] Support G_FMUL for s32 and s64
TableGen already generates code for selecting a G_FMUL, so we only need
to add a test for that part.

For the legalizer and reg bank select, we do the same thing as the other
floating point binary operators: either mark as legal if we have a FP
unit or lower to a libcall, and map to the floating point registers.

llvm-svn: 318910
2017-11-23 12:44:20 +00:00
Oliver Stannard 9cb89f6611 [ARM] Remove pre-UAL FLDM/FSTM aliases
These are pre-UAL syntax, and we don't support any other pre-UAL instructions,
with the exception of FLDMX/FSTMX, which don't have a UAL equivalent. Therefore
there's no reason to keep them or their AsmParser hacks around.

With the AsmParser hacks removed, the FLDMX and FSTMX instructions get the same
operand diagnostics as the UAL instructions.

Differential revision: https://reviews.llvm.org/D39196

llvm-svn: 318777
2017-11-21 16:20:25 +00:00
Oliver Stannard 1e6d4b9e62 [ARM] Don't omit non-default predication code
This was causing the (invalid) predicated versions of the NEON VRINTX and
VRINTZ instructions to be accepted, with the condition code being ignored.

Also, there is no NEON VRINTR instruction, so that part of the check was not
necessary.

Differential revision: https://reviews.llvm.org/D39193

llvm-svn: 318771
2017-11-21 15:34:15 +00:00
Oliver Stannard 1e73e95f3c [Asm] Improve "too few operands" errors
- We can still emit this error if the actual instruction has two or more
  operands missing compared to the expected one.
- We should only emit this error once per instruction.

Differential revision: https://reviews.llvm.org/D36746

llvm-svn: 318770
2017-11-21 15:16:50 +00:00
Oliver Stannard d6ca9879ba [ARM] Add diagnostics for SPR/DPR lists
Differential revision: https://reviews.llvm.org/D39195

llvm-svn: 318766
2017-11-21 15:06:01 +00:00
Martell Malone 51d82696db [ARM] Use SEH exceptions on thumbv7-windows
Reviewers: mstorsjo

Differential Revision: https://reviews.llvm.org/D40286

llvm-svn: 318756
2017-11-21 11:30:20 +00:00
Eugene Leviant 6bc35a93e6 [MI scheduler] Fix VADD and VSUB in cortex-a57 model
This patch fixes instregex for interger vector add/sub instructions

Differential revision: https://reviews.llvm.org/D40254

llvm-svn: 318749
2017-11-21 11:01:28 +00:00
Martin Storsjo b4c907edd7 [ARM] Use dwarf exception handling on MinGW
Enabling and using dwarf exceptions seems like an easier path
to take, than to make the COFF/ARM backend output EHABI directives.
Previously, no EH model was enabled at all on this target.

There's no point in setting UseIntegratedAssembler to false since
GNU binutils doesn't support Windows on ARM, and since we don't
need to support external assembler, we don't need to use register
numbers in cfi directives.

Differential Revision: https://reviews.llvm.org/D39532

llvm-svn: 318510
2017-11-17 08:04:40 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Yi Kong 39bcd4ed3e [ARM] 't' asm constraint should accept i32
't' constraint normally only accepts f32 operands, but for VCVT the
operands can be i32. LLVM is overly restrictive and rejects asm like:

  float foo() {
    float result;
    __asm__ __volatile__(
      "vcvt.f32.s32 %[result], %[arg1]\n"
      : [result]"=t"(result)
      : [arg1]"t"(0x01020304) );
    return result;
  }

Relax the value type for 't' constraint to either f32 or i32.

Differential Revision: https://reviews.llvm.org/D40137

llvm-svn: 318472
2017-11-16 23:38:17 +00:00
Daniel Sanders f76f315436 [globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler

Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
  step due to a lack of a portable 'cat' command. It should be the
  concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
  changes

Depends on D39742

Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka

Reviewed By: rovka

Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39747

llvm-svn: 318356
2017-11-16 00:46:35 +00:00
Daniel Sanders 725584e26d Add backend name to Target to enable runtime info to be fed back into TableGen
Summary:
Make it possible to feed runtime information back to tablegen to enable
profile-guided tablegen-eration, detection of untested tablegen definitions, etc.

Being a cross-compiler by nature, LLVM will potentially collect data for multiple
architectures (e.g. when running 'ninja check'). We therefore need a way for
TableGen to figure out what data applies to the backend it is generating at the
time. This patch achieves that by including the name of the 'def X : Target ...'
for the backend in the TargetRegistry.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D39742

llvm-svn: 318352
2017-11-15 23:55:44 +00:00
Momchil Velikov 4a91fb93db [ARM] Split Arm jump table branch into i12 and rs suffixed versions
This is a refactoring/cleanup of Arm `addrmode2` operand class. The patch
removes it completely.

Differential Revision: https://reviews.llvm.org/D39832

llvm-svn: 318291
2017-11-15 12:02:55 +00:00
Martin Storsjo 4629f52312 [ARM, AArch64] Fix an assert message, Darwin isn't the only target supporting TLS. NFC.
llvm-svn: 318184
2017-11-14 19:57:59 +00:00
Tim Northover 5cdc4f9c33 ARM: correctly update CFG when splitting BB to fix branch.
Because the block-splitting code is multi-purpose, we have to meddle with the
branches when using it to fixup a conditional branch destination. We got the
code right, but forgot to update the CFG so the verifier complained when
expensive checks were on.

Probably harmless since constant-islands comes so late, but best to fix it
anyway.

llvm-svn: 318148
2017-11-14 11:43:54 +00:00
Diana Picus 21a42bcc0b [ARM GlobalISel] Remove C++ code for G_CONSTANT
Get rid of the handwritten instruction selector code for handling
G_CONSTANT. This code wasn't checking all the preconditions correctly
anyway, so it's better to leave it to TableGen, which can handle at
least some cases correctly (e.g. MOVi, MOVi16, folding into binary
operations). Also add tests to cover those cases.

llvm-svn: 318146
2017-11-14 11:20:32 +00:00
Momchil Velikov dc86e1444d [ARM] Fix incorrect conversion of a tail call to an ordinary call
When we emit a tail call for Armv8-M, but then discover that the caller needs to
save/restore `LR`, we convert the tail call to an ordinary one, since restoring
`LR` takes extra instructions, which may negate the benefits of the tail
call. If the callee, however, takes stack arguments, this conversion is
incorrect, since nothing has been done to pass the stack arguments.

Thus the patch reverts https://reviews.llvm.org/rL294000

Also, we improve the instruction sequence for popping `LR` in the case when we
couldn't immediately find a scratch low register, but we can use as a temporary
one of the callee-saved low registers and restore `LR` before popping other
callee-saves.

Differential Revision: https://reviews.llvm.org/D39599

llvm-svn: 318143
2017-11-14 10:36:52 +00:00
Evgeniy Stepanov 76d5ac4906 [arm] Fix Unnecessary reloads from GOT.
Summary:
This fixes PR35221.
Use pseudo-instructions to let MachineCSE hoist global address computation.

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39871

llvm-svn: 318081
2017-11-13 20:45:38 +00:00
Momchil Velikov 842aa90192 [ARM] Place jump table as the first operand in additions
When generating table jump code for switch statements, place the jump
table label as the first operand in the various addition instructions
in order to enable addressing mode selectors to better match index
computation and possibly fold them into the addressing mode of the
table entry load instruction.

Differential revision: https://reviews.llvm.org/D39752

llvm-svn: 318033
2017-11-13 11:56:48 +00:00
Mandeep Singh Grang d104673257 [llvm] Remove redundant return [NFC]
Reviewers: davidxl, olista01, Eugene.Zelenko

Reviewed By: Eugene.Zelenko

Subscribers: sdardis, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39917

llvm-svn: 317995
2017-11-12 03:47:50 +00:00
Jonas Paulsson 4b017e682d [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Kristof Beyls af9814a1fc [GlobalISel] Enable legalizing non-power-of-2 sized types.
This changes the interface of how targets describe how to legalize, see
the below description.

1. Interface for targets to describe how to legalize.

In GlobalISel, the API in the LegalizerInfo class is the main interface
for targets to specify which types are legal for which operations, and
what to do to turn illegal type/operation combinations into legal ones.

For each operation the type sizes that can be legalized without having
to change the size of the type are specified with a call to setAction.
This isn't different to how GlobalISel worked before. For example, for a
target that supports 32 and 64 bit adds natively:

  for (auto Ty : {s32, s64})
    setAction({G_ADD, 0, s32}, Legal);

or for a target that needs a library call for a 32 bit division:

  setAction({G_SDIV, s32}, Libcall);

The main conceptual change to the LegalizerInfo API, is in specifying
how to legalize the type sizes for which a change of size is needed. For
example, in the above example, how to specify how all types from i1 to
i8388607 (apart from s32 and s64 which are legal) need to be legalized
and expressed in terms of operations on the available legal sizes
(again, i32 and i64 in this case). Before, the implementation only
allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0,
s128}, NarrowScalar).  A worse limitation was that if you'd wanted to
specify how to legalize all the sized types as allowed by the LLVM-IR
LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times
and probably would need a lot of memory to store all of these
specifications.

Instead, the legalization actions that need to change the size of the
type are specified now using a "SizeChangeStrategy".  For example:

   setLegalizeScalarToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerAndNarrowToLargest);

This example indicates that for type sizes for which there is a larger
size that can be legalized towards, do it by Widening the size.
For example, G_ADD on s17 will be legalized by first doing WidenScalar
to make it s32, after which it's legal.
The "NarrowToLargest" indicates what to do if there is no larger size
that can be legalized towards. E.g. G_ADD on s92 will be legalized by
doing NarrowScalar to s64.

Another example, taken from the ARM backend is:
   for (unsigned Op : {G_SDIV, G_UDIV}) {
     setLegalizeScalarToDifferentSizeStrategy(Op, 0,
         widenToLargerTypesUnsupportedOtherwise);
     if (ST.hasDivideInARMMode())
       setAction({Op, s32}, Legal);
     else
       setAction({Op, s32}, Libcall);
   }

For this example, G_SDIV on s8, on a target without a divide
instruction, would be legalized by first doing action (WidenScalar,
s32), followed by (Libcall, s32).

The same principle is also followed for when the number of vector lanes
on vector data types need to be changed, e.g.:

   setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal);
   setLegalizeVectorElementToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerTypesUnsupportedOtherwise);

As currently implemented here, vector types are legalized by first
making the vector element size legal, followed by then making the number
of lanes legal. The strategy to follow in the first step is set by a
call to setLegalizeVectorElementToDifferentSizeStrategy, see example
above.  The strategy followed in the second step
"moreToWiderTypesAndLessToWidest" (see code for its definition),
indicating that vectors are widened to more elements so they map to
natively supported vector widths, or when there isn't a legal wider
vector, split the vector to map it to the widest vector supported.

Therefore, for the above specification, some example legalizations are:
  * getAction({G_ADD, LLT::vector(3, 3)})
    returns {WidenScalar, LLT::vector(3, 8)}
  * getAction({G_ADD, LLT::vector(3, 8)})
    then returns {MoreElements, LLT::vector(8, 8)}
  * getAction({G_ADD, LLT::vector(20, 8)})
    returns {FewerElements, LLT::vector(16, 8)}


2. Key implementation aspects.

How to legalize a specific (operation, type index, size) tuple is
represented by mapping intervals of integers representing a range of
size types to an action to take, e.g.:

       setScalarAction({G_ADD, LLT:scalar(1)},
                       {{1, WidenScalar},  // bit sizes [ 1, 31[
                        {32, Legal},       // bit sizes [32, 33[
                        {33, WidenScalar}, // bit sizes [33, 64[
                        {64, Legal},       // bit sizes [64, 65[
                        {65, NarrowScalar} // bit sizes [65, +inf[
                       });

Please note that most of the code to do the actual lowering of
non-power-of-2 sized types is currently missing, this is just trying to
make it possible for targets to specify what is legal, and how non-legal
types should be legalized.  Probably quite a bit of further work is
needed in the actual legalizing and the other passes in GlobalISel to
support non-power-of-2 sized types.

I hope the documentation in LegalizerInfo.h and the examples provided in the
various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well
enough how this is meant to be used.

This drops the need for LLT::{half,double}...Size().


Differential Revision: https://reviews.llvm.org/D30529

llvm-svn: 317560
2017-11-07 10:34:34 +00:00
David Blaikie 1be62f0327 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Diana Picus acf4bf21ab [ARM GlobalISel] Move the check for Thumb higher up
We're currently bailing out for Thumb targets while lowering formal
parameters, but there used to be some other checks before it, which
could've caused some functions (e.g. those without formal parameters) to
sneak through unnoticed.

llvm-svn: 317312
2017-11-03 10:30:12 +00:00
Sam Parker 242052c6b4 [ARM] and, or, xor and add with shl combine
The generic dag combiner will fold:

(shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
(shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)

This can create constants which are too large to use as an immediate.
Many ALU operations are also able of performing the shl, so we can
unfold the transformation to prevent a mov imm instruction from being
generated.

Other patterns, such as b + ((a << 1) | 510), can also be simplified
in the same manner.

Differential Revision: https://reviews.llvm.org/D38084

llvm-svn: 317197
2017-11-02 10:43:10 +00:00
Roger Ferrer Ibanez 9dfbc10522 Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY"
That change causes PR35103, so reverting until I figure it out.

llvm-svn: 317092
2017-11-01 14:06:57 +00:00
Javed Absar 5cde1ccb29 [GlobalISel|ARM] : Allow legalizing G_FSUB
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261

llvm-svn: 316902
2017-10-30 13:51:56 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
David Blaikie 8699f71310 Add a few missing headers for modularization/IWYU/etc
Several cases where class definitions are required for DenseMap pointer
traits handling.

llvm-svn: 316803
2017-10-27 22:12:46 +00:00
David Blaikie 6265130054 InstructionSelectorImpl.h: Modularize/remove ODR violations by using a static member function to expose the debug name
llvm-svn: 316715
2017-10-26 23:39:54 +00:00
Eli Friedman d5dfb62de7 [ARM] Honor -mfloat-abi for libcall calling convention
As far as I can tell, this matches gcc: -mfloat-abi determines the
calling convention for all functions except those explicitly defined as
soft-float in the ARM RTABI.

This change only affects cases where the user specifies -mfloat-abi to
override the default calling convention derived from the target triple.

Fixes https://bugs.llvm.org//show_bug.cgi?id=34530.

Differential Revision: https://reviews.llvm.org/D38299

llvm-svn: 316708
2017-10-26 21:42:32 +00:00
Yichao Yu 221dae31a5 Clear LastMappingSymbols and LastEMS(Info) when resetting the ARM(AArch64)ELFStreamer
Summary:
This causes a segfault on ARM when (I think) the pass manager is used multiple times.

Reset set the (last) current section to NULL without saving the corresponding LastEMSInfo back into the map. The next use of the streamer then save the LastEMSInfo for the NULL section leaving the LastEMSInfo mapping for the last current section (the one that was there before the reset) NULL which cause the LastEMSInfo to be set to NULL when the section is being used again.

The reuse of the section (pointer) might mean that the map was holding dangling pointers previously which is why I went for clearing the map and resetting the info, making it as similar to the state right after the constructor run as possible. The AArch64 one doesn't have segfault (since LastEMS isn't a pointer) but it seems to have the same issue.

The segfault is likely caused by https://reviews.llvm.org/D30724 which turns LastEMSInfo into a pointer. As mentioned above, it seems that the actual issue was older though.

No test is included since the test is believed to be too complicated for such an obvious fix and not worth doing.

Reviewers: llvm-commits, shankare, t.p.northover, peter.smith, rengolin

Reviewed By: rengolin

Subscribers: mgorny, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38588

llvm-svn: 316679
2017-10-26 17:36:43 +00:00
Craig Topper 0551556ed2 [AsmParser][TableGen] Add VariantID argument to the generated mnemonic spell check function so it can use the correct table based on variant.
I'm considering implementing the mnemonic spell checker for x86, and that would require the separate intel and att variants.

llvm-svn: 316641
2017-10-26 06:46:41 +00:00
Craig Topper 2a06028c0a [AsmParser][TableGen] Make the generated mnemonic spell checker function a file local static function.
Also only emit in targets that specificially request it. This is required so we don't get an unused static function error.

llvm-svn: 316640
2017-10-26 06:46:40 +00:00
Diana Picus b35022121d [ARM GlobalISel] Fix call opcodes
We were generating BLX for all the calls, which was incorrect in most
cases. Update ARMCallLowering to generate BL for direct calls, and BLX,
BX_CALL or BMOVPCRX_CALL for indirect calls.

llvm-svn: 316570
2017-10-25 11:42:40 +00:00
Sam Parker 1f742117bd [ARM] OrCombineToBFI function
Extract the functionality to combine OR to BFI into its own function.

Differential Revision: https://reviews.llvm.org/D39001

llvm-svn: 316563
2017-10-25 08:37:33 +00:00
Sam Parker ccb209bb97 [ARM] Swap cmp operands for automatic shifts
Swap the compare operands if the lhs is a shift and the rhs isn't,
as in arm and T2 the shift can be performed by the compare for its
second operand.

Differential Revision: https://reviews.llvm.org/D39004

llvm-svn: 316562
2017-10-25 08:33:06 +00:00
David Blaikie c70b392e49 ARMAddressingModes.h: Don't mark header functions as file local
llvm-svn: 316517
2017-10-24 21:29:21 +00:00
Oliver Stannard 03ded27bbc [ARM] Error for invalid shift in memory operand
Report a diagnostic when we fail to parse a shift in a memory operand because
the shift type is not an identifier. Without this, we were silently ignoring
the whole instruction.

Differential revision: https://reviews.llvm.org/D39237

llvm-svn: 316441
2017-10-24 14:19:08 +00:00
Oliver Stannard ce256a3a01 [ARM] Replace development diagnostics with normal DEBUG macro
* Remove the -arm-asm-parser-dev-diags option.
* Use normal DEBUG(dbgs()) printing for the extra development information about
  missing diagnostics.

Differential Revision: https://reviews.llvm.org/D39194

llvm-svn: 316423
2017-10-24 09:46:56 +00:00
Oliver Stannard 6d5a5b98ab [ARM] tSETEND needs IsThumb
This is the Thumb encoding, so the Requires list must include IsThumb.

No test because we happen to select the ARM one first, but that's just luck.

Differential Revision: https://reviews.llvm.org/D39190

llvm-svn: 316421
2017-10-24 09:03:33 +00:00
Oliver Stannard c507b370a1 [ARM] Remove tCPS alias which just crashed
This alias caused a crash when trying to print the "cps #0" instruction in a
diagnostic for thumbv6 (which doesn't have that instruction).
	    
The comment was incorrect, this instruction is UNPREDICTABLE if no flag bits
are set, so I don't think it's worth keeping.

Differential Revision: https://reviews.llvm.org/D39191

llvm-svn: 316420
2017-10-24 08:55:36 +00:00
Sam Parker 487ab86942 [ARM] Allow unrolling of multi-block loops.
Before, loop unrolling was only enabled for loops with a single
block. This restriction has been removed and replaced by:
- allow a maximum of two exiting blocks,
- a four basic block limit for cores with a branch predictor.

Differential Revision: https://reviews.llvm.org/D38952

llvm-svn: 316313
2017-10-23 08:05:14 +00:00
Momchil Velikov d6a4ab3d49 [ARM] Dynamic stack alignment for 16-bit Thumb
This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When
targeting processors, which support only the 16-bit Thumb instruction set
the compiler ignores the alignment attributes of automatic variables and may
silently generate incorrect code.

Differential revision: https://reviews.llvm.org/D38143

llvm-svn: 316289
2017-10-22 11:56:35 +00:00
Eugene Leviant 27b226fb65 [ARM] Use post-RA MI scheduler when +use-misched is set
Differential revision: https://reviews.llvm.org/D39100

llvm-svn: 316214
2017-10-20 14:29:17 +00:00
Andre Vieira d4a25707f0 [ARM] Fix disassembly for conditional VMRS and VMSR instructions in ARM mode
Differential Revision: https://reviews.llvm.org/D38347

llvm-svn: 316085
2017-10-18 14:47:37 +00:00
NAKAMURA Takumi 6f43bd4bde Untabify.
llvm-svn: 316079
2017-10-18 13:31:28 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Matthias Braun bb8507e63c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Matthias Braun 3a9c114b24 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Lang Hames 2241ffa43c [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that,
and allows us to remove the last instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315531
2017-10-11 23:34:47 +00:00
Oliver Stannard 4191b9eaea [Asm] Add debug tracing in table-generated assembly matcher
This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.

The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.

llvm-svn: 315445
2017-10-11 09:17:43 +00:00
Lang Hames 02d330548d [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that,
and allows us to remove another instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315410
2017-10-11 01:57:21 +00:00
Lang Hames 232cdb48fc [MC] Add another missing <memory> include left out of r315327.
llvm-svn: 315332
2017-10-10 16:59:01 +00:00
Lang Hames 60fbc7cc38 [MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriter
functions.

This makes the ownership of the resulting MCObjectWriter clear, and allows us
to remove one instance of MCObjectStreamer's bizarre "holding ownership via
someone else's reference" trick.

llvm-svn: 315327
2017-10-10 16:28:07 +00:00
Oliver Stannard 30b732c942 [ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs
Previously, the code that implemented the GNU assembler aliases for the
LDRD and STRD instructions (where the second register is omitted)
assumed that the input was a valid instruction. This caused assertion
failures for every example in ldrd-strd-gnu-bad-inst.s.

This improves this code so that it bails out if the instruction is not
in the expected format, the check bails out, and the asm parser is run
on the unmodified instruction.

It also relaxes the alias on thumb targets, so that unaligned pairs of
registers can be used. The restriction that Rt must be even-numbered
only applies to the ARM versions of these instructions.

Differential revision: https://reviews.llvm.org/D36732

llvm-svn: 315305
2017-10-10 12:38:22 +00:00
Oliver Stannard cd3306f62f [ARM, Asm] Add diagnostics for floating-point register operands
This adds diagnostic strings for the ARM floating-point register
classes, which will be used when these classes are expected by the
assembler, but the provided operand is not valid.

One of these, DPR, requires C++ code to select the correct error
message, as that class contains different registers depending on the
FPU. The rest can all have their diagnostic strings stored in the
tablegen decription of them.

Differential revision: https://reviews.llvm.org/D36693

llvm-svn: 315304
2017-10-10 12:35:09 +00:00
Oliver Stannard bbad419e94 [ARM, Asm] Add diagnostics for general-purpose register operands
This adds diagnostic strings for the ARM general-purpose register
classes, which will be used when these classes are expected by the
assembler, but the provided operand is not valid.

One of these, rGPR, requires C++ code to select the correct error
message, as that class contains different registers in pre-v8 and v8
targets. The rest can all have their diagnostic strings stored in the
tablegen description of them.

Differential revision: https://reviews.llvm.org/D36692

llvm-svn: 315303
2017-10-10 12:31:53 +00:00
Lang Hames 77dff39cb4 [MC] Plumb unique_ptr<MCWinCOFFObjectTargetWriter> through
createWinCOFFObjectWriter to WinCOFFObjectWriter's constructor.

Fixes the same ownership issue for COFF that r315245 did for MachO:
WinCOFFObjectWriter takes ownership of its MCWinCOFFObjectTargetWriter, so we
want to pass this through to the constructor via a unique_ptr, rather than a
raw ptr.

llvm-svn: 315257
2017-10-10 00:50:29 +00:00
Lang Hames dcb312bdb9 [MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter to
ELFObjectWriter's constructor.

Fixes the same ownership issue for ELF that r315245 did for MachO:
ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to
pass this through to the constructor via a unique_ptr, rather than a raw ptr.

llvm-svn: 315254
2017-10-09 23:53:15 +00:00
Lang Hames 9b206a7d60 [MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriter
to MCObjectWriter's constructor.

MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this
patch plumbs that ownership relationship through the constructor (which
previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter
function.

llvm-svn: 315245
2017-10-09 22:38:13 +00:00
Aditya Nandakumar c3bfc81a1f [GISel]: Fix generation of illegal COPYs during CallLowering
We end up creating COPY's that are either truncating/extending and this
should be illegal.

https://reviews.llvm.org/D37640

Patch for X86 and ARM by igorb, rovka

llvm-svn: 315240
2017-10-09 20:07:43 +00:00
Diana Picus e393bc72ee [ARM] GlobalISel: Select shifts
Unfortunately TableGen doesn't handle this yet:
Unable to deduce gMIR opcode to handle Src (which is a leaf).

Just add some temporary hand-written code to generate the proper MOVsr.

llvm-svn: 315071
2017-10-06 15:39:16 +00:00
Diana Picus a81a4b17e5 [ARM] GlobalISel: Map shift operands to GPRs
llvm-svn: 315067
2017-10-06 14:52:43 +00:00
Diana Picus 2c95730450 [ARM] GlobalISel: Mark shifts as legal for s32
The new legalize combiner introduces shifts all over the place, so we
should support them sooner rather than later.

llvm-svn: 315064
2017-10-06 14:30:05 +00:00
Oliver Stannard 878216dd05 [ARM] Add diag string for movw/movt immediates in assembly
This adds diagnostics for invalid immediate operands to the MOVW and MOVT
instructions (ARM and Thumb).

Differential revision: https://reviews.llvm.org/D31879

llvm-svn: 314888
2017-10-04 09:24:54 +00:00
Oliver Stannard 5a7aae3a80 [ARM, Asm] Change grammar of immediate operand diagnostics
Currently, our diagnostics for assembly operands are not consistent.
Some start with (for example) "immediate operand must be ...",
and some with "operand must be an immediate ...". I think the latter
form is preferable for a few reasons:
* It's unambiguous that it is referring to the expected type of operand, not
  the type the user provided. For example, the user could provide an register
  operand, and get a message taking about an operand is if it is already an
  immediate, just not in the accepted range.
* It allows us to have a consistent style once we add diagnostics for operands
  that could take two forms, for example a label or pc-relative memory operand.

Differential revision: https://reviews.llvm.org/D36689

llvm-svn: 314887
2017-10-04 09:18:07 +00:00
Oliver Stannard 0d5c792223 [ARM] Use table-gen'd assembly operand diags in ARM asm parser
This switches the ARM AsmParser to use assembly operand diagnostics from
tablegen, rather than a switch statement on the ARMMatchResultTy. It
moves the existing diagnostic strings to tablegen, but adds no new ones,
so this is NFC except for one diagnostic string that had an off-by-1 error
in the hand-written switch statement.

Differential revision: https://reviews.llvm.org/D31607

llvm-svn: 314804
2017-10-03 14:38:52 +00:00
Oliver Stannard 55114fd9f0 [ARM, Asm] Use correct source location for register tokens
tryParseRegister advances the lexer, so we need to take copies of the start and
end locations of the register operand before calling it.

Previously, the caret in the diagnostic pointer to the comma after the r0
operand in the test, rather than the start of the operand.

Differential revision: https://reviews.llvm.org/D31537

llvm-svn: 314799
2017-10-03 14:30:58 +00:00
Oliver Stannard 68aa7de517 [ARM, Asm] Fix ubsan failure caused by out-of-range enum value
In this code, we use ~0U as a sentinel value for any operand class that doesn't
have a user-friendly error message, but this value isn't in range of the
MatchClassKind enum, so we need to ensure it does not get passed to isSubclass.

llvm-svn: 314793
2017-10-03 12:45:18 +00:00
Oliver Stannard 5daee987fd [ARM, Asm] Remove dead code causing MSan failure.
r314779 caused ErrorInfo to be red uninitialised, but also made this code dead,
so it can just be removed.

llvm-svn: 314791
2017-10-03 12:28:28 +00:00
Oliver Stannard e093bad472 [ARM] Use new assembler diags for ARM
This converts the ARM AsmParser to use the new assembly matcher error
reporting mechanism, which allows errors to be reported for multiple
instruction encodings when it is ambiguous which one the user intended
to use.

By itself this doesn't improve many error messages, because we don't have
diagnostic text for most operand types, but as we add that then this will allow
more of those diagnostic strings to be used when they are relevant.

Differential revision: https://reviews.llvm.org/D31530

llvm-svn: 314779
2017-10-03 10:26:11 +00:00
Sjoerd Meijer 7a22a4948f ISel type legalization: add debug messages. NFCI.
This adds some more debug messages to the type legalizer and functions
like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make
the debug messages a little bit more informative and useful.

Differential Revision: https://reviews.llvm.org/D38450

llvm-svn: 314773
2017-10-03 08:54:15 +00:00
Jonas Paulsson c9e363ac69 [SystemZ] implement shouldCoalesce()
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.

If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).

This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610

llvm-svn: 314516
2017-09-29 14:31:39 +00:00
Sam Parker 963da5b119 [ARM] v8.3-a complex number support
New instructions are added to AArch32 and AArch64 to aid
floating-point multiplication and addition of complex numbers, where
the complex numbers are packed in a vector register as a pair of
elements. The Imaginary part of the number is placed in the more
significant element, and the Real part of the number is placed in the
less significant element.

This patch adds assembler for the ARM target.

Differential Revision: https://reviews.llvm.org/D36789

llvm-svn: 314511
2017-09-29 13:11:33 +00:00
Matthias Braun 51687912a4 ARM: Fix cases where CSI Restored bit is not cleared
LR is an untypical callee saved register in that it is restored into a
different register (PC) and thus does not live-out of the return block.
This case requires the `Restored` flag in CalleeSavedInfo to be cleared.

This fixes a number of cases where this wasn't handled correctly yet.

llvm-svn: 314471
2017-09-28 23:12:06 +00:00
Martin Storsjo d6218cc385 [ARM] Restore the right frame pointer register in Int_eh_sjlj_longjmp
In setupEntryBlockAndCallSites in CodeGen/SjLjEHPrepare.cpp,
we fetch and store the actual frame pointer, but on return via
the longjmp intrinsic, it always was restored into the r7 variable.

On windows, the frame pointer should be restored into r11 instead of r7.

On Darwin (where sjlj exception handling is used by default), the frame
pointer is always r7, both in arm and thumb mode, and likewise, on
windows, the frame pointer always is r11.

On linux however, if sjlj exception handling is enabled (which it isn't
by default), libcxxabi and the user code can be built in differing modes
using different registers as frame pointer. Therefore, when restoring
registers on a platform where we don't always use the same register
depending on code mode, restore both r7 and r11.

Differential Revision: https://reviews.llvm.org/D38253

llvm-svn: 314451
2017-09-28 19:04:30 +00:00
Martin Storsjo adceba59a2 [ARM] Fix SJLJ exception handling when manually chosen on a platform where it isn't default
Differential Revision: https://reviews.llvm.org/D38252

llvm-svn: 314450
2017-09-28 19:04:14 +00:00
Sam Parker 211f47aa37 [ARM] isTruncateFree fix
I implemented isTruncateFree in rL313533, this patch fixes the logic
to match my comment, as the previous logic was too general. Now the
only truncates that are free are i64 -> i32.

Differential Revision: https://reviews.llvm.org/D38234

llvm-svn: 314280
2017-09-27 08:30:45 +00:00
Eli Friedman edee9999c4 Revert r312724 ("[ARM] Remove redundant vcvt patterns.").
It leads to some improvements, but also a regression for the simple
case, so it's not clearly a good idea.

test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference.

Ultimately, the right solution is probably to custom-lower fp-to-int
conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast.
It's hard to do the right thing when the implicit bitcast isn't visible
to DAG transforms.

llvm-svn: 314169
2017-09-25 22:07:33 +00:00
Craig Topper 5bc10ede53 [SelectionDAG] Teach simplifyDemandedBits to handle shifts by constant splat vectors
This teach simplifyDemandedBits to handle constant splat vector shifts.

This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount.

I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here.

I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change.

Differential Revision: https://reviews.llvm.org/D37665

llvm-svn: 314139
2017-09-25 19:26:08 +00:00
Arnold Schwaighofer b45717adda ARM: One more fix for swifterror CSR set
We use a differently ordered CSR set if the frame pointer is pushed. Add a
matching ..._SwiftError version.

llvm-svn: 314128
2017-09-25 17:51:33 +00:00
Benjamin Kramer a23c1a37d0 [ARM] Fix -Wdangling-else warning.
A ternary is clearer here. No functionality change.

llvm-svn: 314123
2017-09-25 17:35:38 +00:00
Arnold Schwaighofer ae4de58a5b ARM: Use the proper swifterror CSR list on platforms other than darwin
Noticed by inspection

llvm-svn: 314121
2017-09-25 17:19:50 +00:00
Andre Vieira 640527f7f1 [ARM] Fix assembly and disassembly for VMRS/VMSR
Reviewed by: t.p.northover
Differential Revision: https://reviews.llvm.org/D36306

llvm-svn: 313979
2017-09-22 12:17:42 +00:00
Simon Pilgrim 2b1c3bb25d [ARM] Add missing selection patterns for vnmla
For the following function:

  double fn1(double d0, double d1, double d2) {
    double a = -d0 - d1 * d2;
    return a;
  }

on ARM, LLVM generates code along the lines of

  vneg.f64  d0, d0
  vmls.f64  d0, d1, d2

i.e., a negate and a multiply-subtract.

The attached patch adds instruction selection patterns to allow it to generate the single instruction

  vnmla.f64  d0, d1, d2

(multiply-add with negation) instead, like GCC does.

Committed on behalf of @gergo- (Gergö Barany)

Differential Revision: https://reviews.llvm.org/D35911

llvm-svn: 313972
2017-09-22 09:50:52 +00:00
Eugene Zelenko 076468c0d0 [ARM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 313823
2017-09-20 21:35:51 +00:00
Jonathan Roelofs 85908aa84b [ARM] Relax 'cpsie'/'cpsid' flag parsing.
The ARM docs suggest in examples that the flags can have either case, and there
are applications in the wild that (libopencm3, for example) that expect to be
able to use the uppercase spelling.

https://reviews.llvm.org/D37953

llvm-svn: 313680
2017-09-19 21:23:19 +00:00
Roger Ferrer Ibanez 8d0180c955 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313618
2017-09-19 09:05:39 +00:00
Sam Parker 71efbe4c68 [ARM] Implement isTruncateFree
Implement the isTruncateFree hooks, lifted from AArch64, that are
used by TargetTransformInfo. This allows simplifycfg to reduce the
test case into a single basic block.

Differential Revision: https://reviews.llvm.org/D37516

llvm-svn: 313533
2017-09-18 14:28:51 +00:00
Sjoerd Meijer 4e6df15962 [ARM] Fix for indexed dot product instruction descriptions
The indexed dot product instructions only accept the lower 16 D-registers as
the indexed register, but we were e.g. incorrectly accepting:

vudot.u8 d16,d16,d18[0]

Differential Revision: https://reviews.llvm.org/D37968

llvm-svn: 313531
2017-09-18 14:17:57 +00:00
Hans Wennborg 8c1eb106bd Revert r313009 "[ARM] Use ADDCARRY / SUBCARRY"
This was causing PR34045 to fire again.

> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
>  - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
>  - lowering is done by first converting the boolean value into the carry flag
>    using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
>    using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
>    operations does the actual addition.
>  - for subtraction, given that ISD::SUBCARRY second result is actually a
>    borrow, we need to invert the value of the second operand and result before
>    and after using ARMISD::SUBE. We need to invert the carry result of
>    ARMISD::SUBE to preserve the semantics.
>  - given that the generic combiner may lower ISD::ADDCARRY and
>    ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
>    as well otherwise i64 operations now would require branches. This implies
>    updating the corresponding test for unsigned.
>  - add new combiner to remove the redundant conversions from/to carry flags
>    to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
>  - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192

Also revert follow-up r313010:

> [ARM] Fix typo when creating ISD::SUB nodes
>
> In D35192, I accidentally introduced a typo when creating ISD::SUB nodes,
> giving them two values instead of one.
>
> This fails when the merge_values combiner finds one of these nodes.
>
> This change fixes PR34564.
>
> Differential Revision: https://reviews.llvm.org/D37690

llvm-svn: 313044
2017-09-12 16:24:17 +00:00
Roger Ferrer Ibanez 9df2527b0b [ARM] Fix typo when creating ISD::SUB nodes
In D35192, I accidentally introduced a typo when creating ISD::SUB nodes,
giving them two values instead of one.

This fails when the merge_values combiner finds one of these nodes.

This change fixes PR34564.

Differential Revision: https://reviews.llvm.org/D37690

llvm-svn: 313010
2017-09-12 07:42:28 +00:00
Roger Ferrer Ibanez 4f92b4162f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313009
2017-09-12 07:40:09 +00:00
Hans Wennborg 075e5a2e2b Revert r312898 "[ARM] Use ADDCARRY / SUBCARRY"
It caused PR34564.

> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
>  - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
>  - lowering is done by first converting the boolean value into the carry flag
>    using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
>    using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
>    operations does the actual addition.
>  - for subtraction, given that ISD::SUBCARRY second result is actually a
>    borrow, we need to invert the value of the second operand and result before
>    and after using ARMISD::SUBE. We need to invert the carry result of
>    ARMISD::SUBE to preserve the semantics.
>  - given that the generic combiner may lower ISD::ADDCARRY and
>    ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
>    as well otherwise i64 operations now would require branches. This implies
>    updating the corresponding test for unsigned.
>  - add new combiner to remove the redundant conversions from/to carry flags
>    to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
>  - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 312980
2017-09-11 23:52:02 +00:00
Andre Vieira c429aabb91 [ARM] Enable the use of SVC anywhere in an IT block
Differential Revision: https://reviews.llvm.org/D37374

llvm-svn: 312908
2017-09-11 11:11:17 +00:00
Roger Ferrer Ibanez 12b20f2307 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 312898
2017-09-11 07:38:05 +00:00
Benjamin Kramer 6ef976d5e1 [ARM] Remove redundant vcvt patterns.
These don't add any value as they're just compositions of existing
patterns. However, they can confuse the cost logic in ISel, leading to
duplicated vcvt instructions like in PR33199.

llvm-svn: 312724
2017-09-07 14:52:26 +00:00
Saleem Abdulrasool 5fba8ba9cc ARM: track globals promoted to coalesced const pool entries
Globals that are promoted to an ARM constant pool may alias with another
existing constant pool entry. We need to keep a reference to all globals
that were promoted to each constant pool value so that we can emit a
distinct label for each promoted global. These labels are necessary so
that debug info can refer to the promoted global without an undefined
reference during linking.

Patch by Stephen Crane!

llvm-svn: 312692
2017-09-07 04:00:13 +00:00
Matthias Braun c9056b834d Insert IMPLICIT_DEFS for undef uses in tail merging
Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.

To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:
- computeLiveIns() just computes the live-ins in a LivePhysRegs set.
- addLiveIns() applies the live-ins to a block live-in list.
- computeAndAddLiveIns() is a convenience function combining the other
  two functions and behaving like computeLiveIns() before this patch.

Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D37034

llvm-svn: 312668
2017-09-06 20:45:24 +00:00
Eli Friedman c22c699882 [ARM] Make ARMExpandPseudo add implicit uses for predicated instructions
Missing these could potentially screw up post-ra scheduling.

Issue found by inspection, so I don't have a real testcase. Included
test just verifies the expected operands after expansion.

Differential Revision: https://reviews.llvm.org/D35156

llvm-svn: 312589
2017-09-05 22:54:06 +00:00
Eli Friedman 06d0ee734a [ARM] Register ARMExpandPseudo pass.
This allows -run-pass etc. to refer to it.

(Split off from D35156.)

llvm-svn: 312587
2017-09-05 22:45:23 +00:00
Diana Picus ac15473cdd [ARM] GlobalISel: Minor cleanups in inst selector
Use the STI member of ARMInstructionSelector instead of
TII.getSubtarget() and also make use of STI's methods instead of
checking the object format manually.

llvm-svn: 312522
2017-09-05 08:22:47 +00:00
Diana Picus abb088691b [ARM] GlobalISel: Support global variables for RWPI
In RWPI code, globals that are not read-only are accessed relative to
the SB register (R9). This is achieved by explicitly generating an ADD
instruction between SB and an offset that we either load from a constant
pool or movw + movt into a register.

llvm-svn: 312521
2017-09-05 07:57:41 +00:00
Diana Picus f959791189 [ARM] GlobalISel: Support ROPI global variables
In the ROPI relocation model, read-only variables are accessed relative
to the PC. We use the (MOV|LDRLIT)_ga_pcrel pseudoinstructions for this.

llvm-svn: 312323
2017-09-01 11:13:39 +00:00
Oliver Stannard d771f6cb16 [ARM] Add 2-operand assembly aliases for Thumb1 ADD/SUB
This adds 2-operand assembly aliases for these instructions:
  add r0, r1    =>   add r0, r0, r1
  sub r0, r1    =>   sub r0, r0, r1

Previously this syntax was only accepted for Thumb2 targets, where the
wide versions of the instructions were used.

This patch allows the 2-operand syntax to be used for Thumb1 targets,
and selects the narrow encoding when it is used for Thumb2 targets.

Differential revision: https://reviews.llvm.org/D37377

llvm-svn: 312321
2017-09-01 10:47:25 +00:00
Diana Picus 1d101d76c0 Move static helper into ARMTargetLowering. NFC
This exposes the isReadOnly(GlobalValue *) in the ARMTargetLowering so
we can make use of it in GlobalISel as well.

llvm-svn: 312320
2017-09-01 10:44:48 +00:00
Sam Parker b036757f3d [ARM] Reverse PostRASched subtarget feature logic
Replace the UsePostRAScheduler SubtargetFeature with
DisablePostRAScheduler, which is then used by Swift and Cyclone.
This patch maintains enabling PostRA scheduling for other Thumb2
capable cores and/or for functions which are being compiled in Arm
mode.

Differential Revision: https://reviews.llvm.org/D37055

llvm-svn: 312226
2017-08-31 08:57:51 +00:00
Benjamin Kramer 79d53febcf [ARM] Replace fixed-size SmallSet with a bitset.
It's smaller. No functionality change.

llvm-svn: 312180
2017-08-30 22:28:30 +00:00
Brian Gesiak 3332976478 [ARM] Use Swift error registers on non-Darwin targets
Summary:
Remove a check for `ARMSubtarget::isTargetDarwin` when determining
whether to use Swift error registers, so that Swift errors work
properly on non-Darwin ARM32 targets (specifically Android).

Before this patch, generated code would save and restores ARM register r8 at
the entry and returns of a function that throws. As r8 is used as a virtual
return value for the object being thrown, this gets overwritten by the restore,
and calling code is unable to catch the error. In turn this caused Swift code
that used `do`/`try`/`catch` to work improperly on Android ARM32 targets.

Addresses Swift bug report https://bugs.swift.org/browse/SR-5438.

Patch by John Holdsworth.

Reviewers: manmanren, rjmccall, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: srhines, aschwaighofer, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35835

llvm-svn: 312164
2017-08-30 20:03:54 +00:00
Javed Absar 5766b8eea0 [ARM] - Tidy-up ARMAsmPrinter.cpp
Change to range-loop where missing.

Reviwewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37199

llvm-svn: 311993
2017-08-29 10:04:18 +00:00
Diana Picus c9f29c62cc [ARM] GlobalISel: Select globals in PIC mode
Support the selection of G_GLOBAL_VALUE in the PIC relocation model. For
simplicity we use the same pseudoinstructions for both Darwin and ELF:
(MOV|LDRLIT)_ga_pcrel(_ldr).

This is new for ELF, so it requires a small update to the ARM pseudo
expansion pass to make sure it adds the correct constant pool modifier
and add-current-address in the case of ELF.

Differential Revision: https://reviews.llvm.org/D36507

llvm-svn: 311992
2017-08-29 09:47:55 +00:00
Joerg Sonnenberger 0f76a35c5e Fix ARMv4 support
ARMv4 doesn't support the "BX" instruction, which has been introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.

Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.

Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.

Differential Revision: https://reviews.llvm.org/D37030

llvm-svn: 311921
2017-08-28 20:20:47 +00:00
Geoff Berry 75c4ae3066 [ARM] Fix bug in ARMLoadStoreOptimizer when kill flags are missing.
Summary:
ARMLoadStoreOpt::FixInvalidRegPairOp() was only checking if one of the
load destination registers to be split overlapped with the base register
if the base register was marked as killed.  Since kill flags may not
always be present, this can lead to incorrect code.

This bug was exposed by my MachineCopyPropagation change D30751 breaking
the sanitizer-x86_64-linux-android buildbot.

Also clean up some dead code and add an assert that a register offset is
never encountered by this code, since it does not handle them correctly.

Reviewers: MatzeB, qcolombet, t.p.northover

Subscribers: aemerson, javed.absar, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37164

llvm-svn: 311907
2017-08-28 19:03:45 +00:00
NAKAMURA Takumi a1e97a77f5 Untabify.
llvm-svn: 311875
2017-08-28 06:47:47 +00:00
Javed Absar b81fa9932a [ARM] Tidy-up condition-code support functions
Move condition code support functions to Utils and remove code duplication.

Reviewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37179

llvm-svn: 311860
2017-08-27 20:38:28 +00:00
Javed Absar 17ee7c0977 [ARM] Tidy-up ARMAsmParser. NFC.
Simplify getDRegFromQReg function

Reviewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37118

llvm-svn: 311850
2017-08-27 14:46:57 +00:00
Evgeny Astigeevich 540a39adf7 [ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes
ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes
for the Thumb1 target. This causes generation of redundant code and affects
performance.

This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106

Differential Revision: https://reviews.llvm.org/D36467

llvm-svn: 311649
2017-08-24 10:00:25 +00:00
Tim Northover 4bafa16748 ARM: use internal relocations for local symbols after all.
Switching to external relocations for ARM-mode branches (to allow Thumb
interworking when the offset is unencodable) causes calls to temporary symbols
to be miscompiled and instead go to the parent externally visible symbol.

Calling a temporary never happens in compiled code, but can occasionally in
hand-written assembly.

llvm-svn: 311611
2017-08-23 22:07:10 +00:00
Aditya Nandakumar efd8a84cd5 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Florian Hahn 214e13d949 [ARM] Add missing patterns for insert_subvector.
Summary: In some cases, shufflevector instruction can be transformed involving insert_subvector instructions. The ARM backend was missing some insert_subvector patterns, causing a failure during instruction selection. AArch64 has similar patterns.

Reviewers: t.p.northover, olista01, javed.absar, rengolin

Reviewed By: javed.absar

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36796

llvm-svn: 311543
2017-08-23 10:20:59 +00:00
Matthias Braun 55bc9b3f9e TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Sam Parker 6dc3fcb1c6 [ARM][AArch64] v8.3-A Javascript Conversion
Armv8.3-A adds instructions that convert a double-precision floating
point number to a signed 32-bit integer with round towards zero,
designed for improving Javascript performance.

Differential Revision: https://reviews.llvm.org/D36785

llvm-svn: 311448
2017-08-22 11:08:21 +00:00
Renato Golin f63d701669 [ARM] Call setBooleanContents(ZeroOrOneBooleanContent)
The ARM backend should call setBooleanContents so that it can
use known bits to make some optimizations.

Review: D35821

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311446
2017-08-22 11:02:37 +00:00
Chandler Carruth b866178067 Fix a typo in r311435.
llvm-svn: 311437
2017-08-22 09:20:52 +00:00
Alex Bradbury 080f6976c0 Use report_fatal_error for unsupported calling conventions
The calling convention can be specified by the user in IR. Failing to support 
a particular calling convention isn't a programming error, and so relying on 
llvm_unreachable to catch and report an unsupported calling convention is not 
appropriate.

Differential Revision: https://reviews.llvm.org/D36830

llvm-svn: 311435
2017-08-22 09:11:41 +00:00
Sam Parker b252ffd2cc [ARM][AArch64] Cortex-A75 and Cortex-A55 support
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667

llvm-svn: 311316
2017-08-21 08:43:06 +00:00
Martin Storsjo d606d2a643 [ARM] Factorize the calculation of WhichResult in isV*Mask. NFC.
Differential Revision: https://reviews.llvm.org/D36930

llvm-svn: 311260
2017-08-19 20:26:51 +00:00
Martin Storsjo 91522ffa12 [ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

llvm-svn: 311258
2017-08-19 19:47:48 +00:00
Matthias Braun 91bd3ad128 ARMRegsiterInfo: Define more ssub indexes; NFC
This doesn't really change anything as Tablegen would have inferred
those indices anyway; defining them gives us shorter names that are
easier to read while debugging (i.e. "ssub_4" rather than
"dsub2_then_ssub_0")

llvm-svn: 311218
2017-08-19 01:21:11 +00:00
Tim Northover 14302fcb24 ARM: use an external relocation for calls from MachO ARM mode.
The internal (__text-relative) relocation risks the offset not being encodable
if the destination is Thumb.

llvm-svn: 311187
2017-08-18 19:13:56 +00:00
Sam Parker 04a7db5915 [ARM] Add PostRAScheduler option
This patch adds the option to allow also using the PostRA scheduler,
which brings the ARM backend inline with AArch64 targets. The
SchedModel can also set 'PostRAScheduler', as the R52 does, so also
query this property in the overridden function.

Differential Revision: https://reviews.llvm.org/D36866

llvm-svn: 311162
2017-08-18 14:27:51 +00:00
Saleem Abdulrasool dd8c16b58e ARM: mark CPSR as clobbered for Windows VLAs
When lowering a VLA, we emit a __chstk call.  However, this call can
internally clobber CPSR.  We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`.  In such a case, the CPSR could be clobbered, and the
check invalidated.  When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case.  Mark the register as clobbered to fix a possible
state corruption.

llvm-svn: 311061
2017-08-17 02:42:24 +00:00
Sam Parker 84fd0c3bf2 [ARM] Improve loop unrolling for Cortex-M
- Set the default runtime unroll count to 4 and use the newly added
  UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.

Differential Revision: https://reviews.llvm.org/D36134

llvm-svn: 310997
2017-08-16 07:42:44 +00:00
Quentin Colombet 61d71a138b Reapply "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310425, thus reapplying r310335 with a fix for link
issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON.

Original commit message:
[GlobalISel] Remove the GISelAccessor API.

Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.

NFC.

----
The fix for the link issue consists in adding the GlobalISel library in
the list of dependencies for the AArch64 unittests. This dependency
comes from the use of AArch64Subtarget that needs to know how
to destruct the GISel related APIs when being detroyed.

Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and
understand the problem.

llvm-svn: 310969
2017-08-15 22:31:51 +00:00
Javed Absar 37b2286564 [ARM] Tidy-up Cortex-A15 DPR-SPR optimizer implementation
Modernise the code with range-loops etc

Reviewed by: @fhahn, @rovka
Differential Revision: https://reviews.llvm.org/D36502

llvm-svn: 310807
2017-08-14 01:38:01 +00:00
Craig Topper 2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Florian Hahn a5ba4ee8bc [Triple] Add isThumb and isARM functions.
Summary:
isThumb returns true for Thumb triples (little and big endian), isARM
returns true for ARM triples (little and big endian).
There are a few more checks using arm/thumb that are not covered by
those functions, e.g. that the architecture is either ARM or Thumb
(little endian) or ARM/Thumb little endian only.

Reviewers: javed.absar, rengolin, kristof.beyls, t.p.northover

Reviewed By: rengolin

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D34682

llvm-svn: 310781
2017-08-12 17:40:18 +00:00
Sjoerd Meijer 7426c97bc6 [ARM] Assembler support for the ARMv8.2a dot product instructions
Commit r310480 added the AArch64 ARMv8.2a dot product instructions;
this adds the AArch32 instructions.

Differential Revision: https://reviews.llvm.org/D36575

llvm-svn: 310701
2017-08-11 09:52:30 +00:00
Eli Friedman e9499dd4c7 [ARM] Clarify legal addressing modes for ARM and Thumb2. NFC
The existing code is very clever, but not clear, which seems
like the wrong tradeoff here.

Differential Revision: https://reviews.llvm.org/D36559

llvm-svn: 310653
2017-08-10 19:31:27 +00:00
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Sam Parker 9d95764c3b [ARM][AArch64] ARMv8.3-A enablement
The beta ARMv8.3 ISA specifications have been released for AArch64
and AArch32, these can be found at:
https://developer.arm.com/products/architecture/a-profile/exploration-tools

An introduction to this architecture update can be found at:
https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions

This patch is the first in a series which will add ARM v8.3-A support
in LLVM and Clang. It adds the necessary changes that create targets
for both the ARM and AArch64 backends.

Differential Revision: https://reviews.llvm.org/D36514

llvm-svn: 310561
2017-08-10 09:41:00 +00:00
Matthias Braun a88587ce0c ARM: Fix CMP_SWAP expansion
Clean up after my misguided attempt in r304267 to "fix" CMP_SWAP
returning an uninitialized status value.

- I was always using tMOVi8 to zero the status register which cannot
  encode higher register numbers and llvm would silently miscompile)

- Nobody was ever looking at that status value outside the expansion.
  ARMDAGToDAGISel::SelectCMP_SWAP() the only place creating CMP_SWAP
  instructions was not mapping anything to it. (The cmpxchg status value
  from llvm IR is lowered to a manual comparison after the CMP_SWAP)

So this:
- Renames the register from "status" to "temp" it make it obvious that
  it isn't used outside the expansion.
- Remove the zeroing status/temp register.
- Keep the live-in list improvements from r304267

Fixes http://llvm.org/PR34056

llvm-svn: 310534
2017-08-09 22:22:05 +00:00
Florian Hahn d68bc7ae8d [ARM] Emit error when ARM exec mode is not available.
Summary:
A similar error message has been removed from the ARMTargetMachineBase
constructor in r306939. With this patch, we generate an error message
for the example below, compiled with -mcpu=cortex-m0, which does not
have ARM execution mode.

    __attribute__((target("arm"))) int foo(int a, int b)
    {
        return a + b % a;
    }

    __attribute__((target("thumb"))) int bar(int a, int b)
    {
        return a + b % a;
    }

By adding this error message to ARMBaseTargetMachine::getSubtargetImpl,
we can deal with functions that set -thumb-mode in target-features.
At the moment it seems like Clang does not have access to target-feature
specific information, so adding the error message to the frontend will
be harder.

Reviewers: echristo, richard.barton.arm, t.p.northover, rengolin, efriedma

Reviewed By: echristo, efriedma

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D35627

llvm-svn: 310486
2017-08-09 15:39:10 +00:00
Florian Hahn fc4b3951e9 [ARM] Remove FeatureNoARM implies ModeThumb.
Summary:
By removing FeatureNoARM implies ModeThumb, we can detect cases where a
function's target-features contain -thumb-mode (enables ARM codegen for the
function), but the architecture does not support ARM mode. Previously, the
implication caused the FeatureNoARM bit to be cleared for functions with
-thumb-mode, making the assertion in ARMSubtarget::ARMSubtarget [1]
pointless for such functions.

This assertion is the only guard against generating ARM code for
architectures without ARM codegen support. Is there a place where we
could easily generate error messages for the user? At the moment, we
would generate ARM code for Thumb-only architectures. X86 has the same
behavior as ARM, as in it only has an assertion and no error message,
but I think for ARM an error message would be helpful. What do you
think?

For the example below, `llc -mtriple=armv7m-eabi test.ll -o -` will
generate ARM assembler (or fail with an assertion error with this patch).
Note that if we run the resulting assembler through llvm-mc, we get
an appropriate error message, but not when codegen is handled
through clang.

```
define void @bar() #0 {
entry:
  ret void
}

attributes #0 = { "target-features"="-thumb-mode" }
```

[1] c1f7b54cef/lib/Target/ARM/ARMSubtarget.cpp (L147)

Reviewers: t.p.northover, rengolin, peter.smith, aadg, silviu.baranga, richard.barton.arm, echristo

Reviewed By: rengolin, echristo

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35569

llvm-svn: 310476
2017-08-09 13:53:28 +00:00
Quentin Colombet 8dd90fb54b Revert "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310115.

It causes a linker failure for the one of the unittests of AArch64 on one
of the linux bot:
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429

: && /home/fedora/gcc/install/gcc-7.1.0/bin/g++   -fPIC
-fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W
-Wno-unused-parameter -Wwrite-strings -Wcast-qual
-Wno-missing-field-initializers -pedantic -Wno-long-long
-Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment
-ffunction-sections -fdata-sections -O2
-L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined
-Wl,-O3 -Wl,--gc-sections
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o  -o
unittests/Target/AArch64/AArch64Tests
lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn
lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn
lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn
lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn
lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread
lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread
-Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib
&& :
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0):
undefined reference to `vtable for llvm::LegalizerInfo'
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8):
undefined reference to `vtable for llvm::RegisterBankInfo'

The particularity of this bot is that it is built with
BUILD_SHARED_LIBS=ON

However, I was not able to reproduce the problem so far.
Reverting to unblock the bot.

llvm-svn: 310425
2017-08-08 22:22:30 +00:00
Tim Northover f370f2e3c6 Revert "[ARM] Fix assembly and disassembly for VMRS/VMSR"
This reverts r310243. Only MVFR2 is actually restricted to v8 and it'll be a
little while before we can get a proper fix together. Better that we allow
incorrect code than reject correct in the meantime.

llvm-svn: 310384
2017-08-08 17:16:46 +00:00
Andre Vieira 7dffb9bfa6 [ARM] Fix assembly and disassembly for VMRS/VMSR
This patch addresses two issues with assembly and disassembly for VMRS/VMSR:

1.currently VMRS/VMSR instructions accessing fpsid, mvfr{0-2} and fpexc, are
  accepted for non ARMv8-A targets.

2. all VMRS/VMSR instructions accept writing/reading to PC and SP, when only
   ARMv7-A and ARMv8-A should be allowed to write/read to SP and none to PC.

This patch addresses those issues and adds tests for these cases.

Differential Revision: https://reviews.llvm.org/D36306

llvm-svn: 310243
2017-08-07 08:41:05 +00:00
Florian Hahn d51a35e339 [ARM] The ARM backend is MachineVerifier clean now.
Summary: Thanks everyone involved in fixing the outstanding issues.

Reviewers: rovka, MatzeB, efriedma

Reviewed By: MatzeB

Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36153

llvm-svn: 310180
2017-08-05 15:14:06 +00:00
Quentin Colombet c046208c52 [GlobalISel] Remove the GISelAccessor API.
Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.

NFC.

llvm-svn: 310115
2017-08-04 20:15:46 +00:00
Javed Absar 9cda599151 [ARM] Use searchable-table for banked registers
This is a continuation of https://reviews.llvm.org/D36219

This patch uses reverse mapping (encoding->name) in
ARMInstPrinter::printBankedRegOperand to get rid of
hard-coded values (as pointed out by @olista01).

Reviewed by: @fhahn, @rovka, @olista01
Differential Revision: https://reviews.llvm.org/D36260

llvm-svn: 310072
2017-08-04 17:10:11 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Nico Weber bfde70b097 Revert r309923, it caused PR34045.
llvm-svn: 309950
2017-08-03 15:41:26 +00:00
Diana Picus 930e6ec8f3 [ARM] GlobalISel: Select simple G_GLOBAL_VALUE instructions
Add support in the instruction selector for G_GLOBAL_VALUE for ELF and
MachO for the static relocation model. We don't handle Windows yet
because that's Thumb-only, and we don't handle Thumb in general at the
moment.

Support for PIC, ROPI, RWPI and TLS will be added in subsequent commits.

Differential Revision: https://reviews.llvm.org/D35883

llvm-svn: 309927
2017-08-03 09:14:59 +00:00
Roger Ferrer Ibanez 854980341b [ARM] Use ADDCARRY / SUBCARRY
This patch:

- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
  using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value
  using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
  operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
  borrow, we need to invert the value of the second operand and result before
  and after using ARMISD::SUBE. We need to invert the carry result of
  ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
  ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering
  as well otherwise i64 operations now would require branches. This implies
  updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
  to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 309923
2017-08-03 07:45:10 +00:00
Rafael Espindola 79e238afee Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Javed Absar 054d1aef43 [ARM] Tidy up banked registers encoding
Moves encoding (SYSm) information of banked registers to ARMSystemRegister.td,
where it rightly belongs and forms a single point of reference in the code.

Reviewed by: @fhahn, @rovka, @olista01
Differential Revision: https://reviews.llvm.org/D36219

llvm-svn: 309910
2017-08-03 01:24:12 +00:00
Strahinja Petrovic 25e9e1b866 [ARM] Add the option to directly access TLS pointer
This patch enables choice for accessing thread local
storage pointer (like '-mtp' in gcc).

Differential Revision: https://reviews.llvm.org/D34408

llvm-svn: 309381
2017-07-28 12:54:57 +00:00
Matthias Braun c618a466f1 ARMFrameLowering: Only set ExtraCSSpill for actually unused registers.
The code assumed that unclobbered/unspilled callee saved registers are
unused in the function. This is not true for callee saved registers that are
also used to pass parameters such as swiftself.

rdar://33401922

llvm-svn: 309350
2017-07-28 01:36:32 +00:00
Florian Hahn e3583bdf91 [ARM] Add use-misched feature, to enable the MachineScheduler.
Summary:
This change makes it easier to experiment with the MachineScheduler in
the ARM backend and also makes it very explicit which CPUs use the
MachineScheduler (currently only swift and cyclone).



Reviewers: MatzeB, t.p.northover, javed.absar

Reviewed By: MatzeB

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35935

llvm-svn: 309316
2017-07-27 19:56:44 +00:00
Florian Hahn 67ddd1d08f [TargetParser] Use enum classes for various ARM kind enums.
Summary:
Using c++11 enum classes ensures that only valid enum values are used
for ArchKind, ProfileKind, VersionKind and ISAKind. This removes the
need for checks that the provided values map to a proper enum value,
allows us to get rid of AK_LAST and prevents comparing values from
different enums. It also removes a bunch of static_cast
from unsigned to enum values and vice versa, at the cost of introducing
static casts to access AArch64ARCHNames and ARMARCHNames by ArchKind.

FPUKind and ArchExtKind are the only remaining old-style enum in
TargetParser.h. I think it's beneficial to keep ArchExtKind as old-style
enum, but FPUKind can be converted too, but this patch is quite big, so
could do this in a follow-up patch. I could also split this patch up a
bit, if people would prefer that.

Reviewers: rengolin, javed.absar, chandlerc, rovka

Reviewed By: rovka

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35882

llvm-svn: 309287
2017-07-27 16:27:56 +00:00
Florian Hahn db479524dd [ARM] Mark labels in skipAlignedDPRCS2Spills as fallthrough (NFC).
The comment at the top of the switch statement indicates that the
fall-through behavior is intentional. By using LLVM_FALLTHROUGH,
-Wimplicit-fallthrough are silenced, which is enabled by default in GCC
7.

llvm-svn: 309272
2017-07-27 14:37:17 +00:00
Daniel Sanders 8e82af2be6 Re-commit: r309094 [globalisel][tablegen] Fuse the generated tables together.
Summary:
Now that we have control flow in place, fuse the per-rule tables into a
single table. This is a compile-time saving at this point. However, this will
also enable the optimization of a table so that similar instructions can be
tested together, reducing the time spent on the matching the code.

This is NFC in terms of externally visible behaviour but some internals have
changed slightly. State.MIs is no longer reset between each rule that is
attempted because it's not necessary to do so. As a consequence of this the
restriction on the order that instructions are added to State.MIs has been
relaxed to only affect recorded instructions that require new elements to be
added to the vector. GIM_RecordInsn can now write to any element from 1 to
State.MIs.size() instead of just State.MIs.size().

The compile-time regressions from the last commit were caused by the ARM target
including a non-const variable (zero_reg) in the table and therefore generating
an initializer for it. That variable is now const.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35681

llvm-svn: 309264
2017-07-27 11:03:45 +00:00
Evandro Menezes b3ed4bcb8f [ARM] Minor cosmetic edits (NFC)
Change the order of a case and the description for Exynos Mx processors.

llvm-svn: 309184
2017-07-26 21:28:20 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Rafael Espindola e06e4df8be Simplify. NFC.
llvm-svn: 309141
2017-07-26 17:27:27 +00:00
Diana Picus a5d6518e93 [ARM] GlobalISel: Map G_GLOBAL_VALUE to GPR
A G_GLOBAL_VALUE is basically a pointer, so it should live in the GPR.

llvm-svn: 309101
2017-07-26 11:01:13 +00:00
Diana Picus b1fd784936 [ARM] GlobalISel: Mark G_GLOBAL_VALUE as legal
llvm-svn: 309090
2017-07-26 09:25:15 +00:00
Zvi Rackover 1b73682243 TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI.
Changing mask argument type from const SmallVectorImpl<int>& to
ArrayRef<int>.

This came up in D35700 where a mask is received as an ArrayRef<int> and
we want to pass it to TargetLowering::isShuffleMaskLegal().
Also saves a few lines of code.

llvm-svn: 309085
2017-07-26 08:06:58 +00:00
Eric Christopher 97ae58686f Update the comments on default subtargets based on feedback.
llvm-svn: 309041
2017-07-25 22:21:08 +00:00
Sam Parker 19a08e42a8 [ARM] Enable partial and runtime unrolling
Enable runtime and partial loop unrolling of simple loops without
calls on M-class cores. The thresholds are calculated based on
whether the target is Thumb or Thumb-2.

Differential Revision: https://reviews.llvm.org/D34619

llvm-svn: 308956
2017-07-25 08:51:30 +00:00
Erich Keane d8f61f8f7e Remove Bitrig: LLVM Changes
Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned.

Differential Revision: https://reviews.llvm.org/D35707

llvm-svn: 308799
2017-07-21 22:48:47 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Javed Absar e9599e39fe [ARM] Simplify ExpandPseudoInst. NFC.
Remove headers not required and convert to range-loop

Reviewed by: @mcrosier
Differential Revision: https://reviews.llvm.org/D35626

llvm-svn: 308607
2017-07-20 12:35:37 +00:00
Javed Absar 2cb0c95031 [ARM] Unify handling of M-Class system registers
This patch cleans up and fixes issues in the M-Class system register handling:

1. It defines the system registers and the encoding (SYSm values) in one place:
   a new ARMSystemRegister.td using SearchableTable, thereby removing the
   hand-coded values which existed in multiple places.

2. Some system registers e.g. BASEPRI_MAX_NS which do not exist were being allowed!
   Ref: ARMv6/7/8M architecture reference manual.

Reviewed by: @t.p.northover, @olist01, @john.brawn
Differential Revision: https://reviews.llvm.org/D35209

llvm-svn: 308456
2017-07-19 12:57:16 +00:00
Javed Absar 5b8e487b47 [ARM|CodeGen] Improve the code in FastISel
Cleaned up the code in FastISel a bit.
Had to add make_range to MCInstrDesc as that was needed and seems missing.

Reviewed by: @t.p.northover
Differential Revision: https://reviews.llvm.org/D35494

llvm-svn: 308291
2017-07-18 10:19:48 +00:00
Diana Picus da25d5b8b0 [ARM] GlobalISel: Support G_(S|U)REM for s8 and s16
Widen to s32, and then do whatever Lowering/Custom/Libcall action the
subtarget wants.

llvm-svn: 308285
2017-07-18 10:07:01 +00:00
Javed Absar dd2c29ef83 [CodeGen] Add begin-end iterators to MachineInstr
Convert iteration over operands to range-loop.

Reviewed by: @rovka, @echristo
Differential Revision: https://reviews.llvm.org/D35419

llvm-svn: 308173
2017-07-17 13:15:26 +00:00
Hiroshi Inoue a9ee279e70 fix typos in comments; NFC
llvm-svn: 308126
2017-07-16 07:48:48 +00:00
Diana Picus 87a7067983 [ARM] GlobalISel: Support G_BRCOND
Insert a TSTri to set the flags and a Bcc to branch based on their
values. This is a bit inefficient in the (common) cases where the
condition for the branch comes from a compare right before the branch,
since we set the flags both as part of the compare lowering and as part
of the branch lowering. We're going to live with that until we settle on
a principled way to handle this kind of situation, which occurs with
other patterns as well (combines might be the way forward here).

llvm-svn: 308009
2017-07-14 09:46:06 +00:00
Sam Parker 2893448576 [ARM] Allow rematerialization of ARM Thumb literal pool loads
Constants are crucial for code size in the ARM Thumb-1 instruction
set. The 16 bit instruction size often does not offer enough space
for immediate arguments. This means that additional instructions are
frequently used to load constants into registers. Since constants are
hoisted, this can lead to significant register spillage if they are
used multiple times in a single function. This can be avoided by
rematerialization, i.e. recomputing a constant instead of reloading
it from the stack. This patch fixes the rematerialization of literal
pool loads in the ARM Thumb instruction set.

Patch by Philip Ginsbach

Differential Revision: https://reviews.llvm.org/D33936

llvm-svn: 308004
2017-07-14 08:23:56 +00:00
Eric Christopher 4e332c7cf1 Add a set of comments explaining why getSubtargetImpl() is deleted on these targets.
llvm-svn: 307999
2017-07-14 04:33:43 +00:00
Diana Picus c452175642 [ARM] GlobalISel: Support G_BR
This boils down to not crashing in reg bank select due to the lack of
register operands on this instruction, and adding some tests. The
instruction selection is already covered by the TableGen'erated code.

llvm-svn: 307904
2017-07-13 11:09:34 +00:00
Javed Absar d32f9c8190 [ARM] Tidy up and organise better ARM.td. NFC.
This patch tidies up and organises ARM.td
so that it is easier to understandand
and extend in the future.

Reviewed by: @hahn, @rovka
Differential Revision: https://reviews.llvm.org/D35248

llvm-svn: 307897
2017-07-13 10:24:30 +00:00
Diana Picus f69d7b0495 Fixup r307893: Silence warning
Silence unused variable warning in release builds.
*sigh*

llvm-svn: 307896
2017-07-13 09:52:06 +00:00
Diana Picus 6860a60c07 [ARM] GlobalISel: Move local variable. NFC
Move a local variable from outside a switch to inside every case that
needs it (which isn't all of the cases, of course).

llvm-svn: 307893
2017-07-13 09:30:08 +00:00
Florian Hahn 4adcfcf1d6 [ARM] Inline callee if its target-features are a subset of the caller
Summary:
Similar to X86, it should be safe to inline callees if their
target-features are a subset of the caller. As some subtarget features
provide different instructions depending on whether they are set or
unset (e.g. ThumbMode and ModeSoftFloat), we use a whitelist of
target-features describing hardware capabilities only.

Reviewers: kristof.beyls, rengolin, t.p.northover, SjoerdMeijer, peter.smith, silviu.baranga, efriedma

Reviewed By: SjoerdMeijer, efriedma

Subscribers: dschuff, efriedma, aemerson, sdardis, javed.absar, arichardson, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D34697

llvm-svn: 307889
2017-07-13 08:26:17 +00:00
John Brawn 97cc283117 [ARM] Adjust ifcvt heuristic for the diamond ifcvt case
When we have a diamond ifcvt the fallthough block will have a branch at the end
of it that disappears when predicated, so discount it from the predication cost.

Differential Revision: https://reviews.llvm.org/D34952

llvm-svn: 307788
2017-07-12 13:23:10 +00:00
Diana Picus 995746da03 [ARM] GlobalISel: Simplify inst selector code. NFC
Refactor CmpHelper into something simpler. It was overkill to use
templates for this - instead, use a simple CmpConstants structure to
hold the opcodes and other constants that are different when selecting
int / float / double comparisons. Also, extract some of the helpers that
were in CmpHelper into ARMInstructionSelector and make use of some of
them when selecting other things than just compares.

llvm-svn: 307766
2017-07-12 10:31:16 +00:00
Diana Picus 21014df5e0 [ARM] GlobalISel: Select s64 G_FCMP
Very similar to how we select s32 G_FCMP, the only thing that is
different is the exact opcodes that we use.

llvm-svn: 307763
2017-07-12 09:01:54 +00:00
Rafael Espindola 1beb702ba2 Fully fix the movw/movt addend.
The issue is not if the value is pcrel. It is whether we have a
relocation or not.

If we have a relocation, the static linker will select the upper
bits. If we don't have a relocation, we have to do it.

llvm-svn: 307730
2017-07-11 23:18:25 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Martin Storsjo 0e83e85f63 [ARM, ELF] Don't shift movt relocation offsets
For ELF, a movw+movt pair is handled as two separate relocations.
If an offset should be applied to the symbol address, this offset is
stored as an immediate in the instruction (as opposed to stored as an
offset in the relocation itself).

Even though the actual value stored in the movt immediate after linking
is the top half of the value, we need to store the unshifted offset
prior to linking. When the relocation is made during linking, the offset
gets added to the target symbol value, and the upper half of the value
is stored in the instruction.

This makes sure that movw+movt with offset symbols get properly
handled, in case the offset addition in the lower half should be
carried over to the upper half.

This makes the output from the additions to the test case match
the output from GNU binutils.

For COFF and MachO, the movw/movt relocations are handled as a pair,
and the overflow from the lower half gets carried over to the movt,
so they should keep the shifted offset just as before.

Differential Revision: https://reviews.llvm.org/D35242

llvm-svn: 307713
2017-07-11 21:07:10 +00:00
Diana Picus 069da27f49 [ARM] GlobalISel: Add reg mapping for s64 G_FCMP
Map the result into GPR and the operands into FPR.

llvm-svn: 307653
2017-07-11 11:47:45 +00:00
Peter Smith a2e5ecc1f3 [ARM] ldr pc,=expression should be allowed in Thumb2
This change allows the pc to be used as a destination register for the
pseudo instruction LDR pc,=expression . The pseudo instruction must not be
transformed into a MOV, but it can use the Thumb2 LDR (literal) instruction
to a constant pool entry. See (A7.7.43 from ARMv7M ARM ARM).

Differential Revision: https://reviews.llvm.org/D34751

llvm-svn: 307640
2017-07-11 09:47:12 +00:00
Diana Picus 443135c6eb [ARM] GlobalISel: Fix oversight in G_FCMP legalization
We used to forget to erase the original instruction when replacing a
G_FCMP true/false. Fix this bug and make sure the tests check for it.

llvm-svn: 307639
2017-07-11 09:43:51 +00:00
Diana Picus b57bba8316 [ARM] GlobalISel: Legalize s64 G_FCMP
Same as the s32 version, for both hard and soft float.

llvm-svn: 307633
2017-07-11 08:50:01 +00:00
Nirav Dave 4dcad5dc6b Add DAG argument to canMergeStoresTo NFC.
llvm-svn: 307583
2017-07-10 20:25:54 +00:00
Javed Absar fb3210aa05 [ARM] Tidy up ARMBaseRegisterInfo implementation. NFC
Clean up ARMBaseRegisterInfo implementation a bit.
Differential Revision: https://reviews.llvm.org/D35116

llvm-svn: 307531
2017-07-10 10:42:55 +00:00
Simon Pilgrim e2d84d953e [ARM] Fix -Wimplicit-fallthrough warning. NFCI.
llvm-svn: 307480
2017-07-08 18:42:04 +00:00
Simon Pilgrim cb07d67a5c Fix some more -Wimplicit-fallthrough warnings. NFCI.
llvm-svn: 307411
2017-07-07 16:40:06 +00:00
Matthew Simpson 12eaef75ce [ARM] Implement interleaved access bug fix from r306334
r306334 fixed a bug in AArch64 dealing with wide interleaved accesses having
pointer types. The bug also exists in ARM, so this patch copies over the fix.

llvm-svn: 307409
2017-07-07 16:15:05 +00:00
Simon Pilgrim ce1fb22c6a [Arm] Fix -Wimplicit-fallthrough warnings. NFCI.
llvm-svn: 307375
2017-07-07 10:05:45 +00:00
Diana Picus 77367378ac [ARM] GlobalISel: Fixup r307365
Rename member DebugLoc -> DbgLoc (so it doesn't conflict with the class
name).

llvm-svn: 307366
2017-07-07 08:53:27 +00:00
Diana Picus 5b91653840 [ARM] GlobalISel: Select hard G_FCMP for s32
We lower to a sequence consisting of:
- MOVi 0 into a register
- VCMPS to do the actual comparison and set the VFP flags
- FMSTAT to move the flags out of the VFP unit
- MOVCCi to either use the "zero register" that we have previously set
  with the MOVi, or move 1 into the result register, based on the values
  of the flags

As was the case with soft-float, for some predicates (one, ueq) we
actually need two comparisons instead of just one. When that happens, we
generate two VCMPS-FMSTAT-MOVCCi sequences and chain them by means of
using the result of the first MOVCCi as the "zero register" for the
second one. This is a bit overkill, since one comparison followed by
two non-flag-setting conditional moves should be enough. In any case,
the backend manages to CSE one of the comparisons away so it doesn't
matter much.

Note that unlike SelectionDAG and FastISel, we always use VCMPS, and not
VCMPES. This makes the code a lot simpler, and it also seems correct
since the LLVM Lang Ref defines simple true/false returns if the
operands are QNaN's. For SNaN's, even VCMPS throws an Invalid Operand
exception, so they won't be slipping through unnoticed.

Implementation-wise, this introduces a template so we can share the same
code that we use for handling integer comparisons, since the only
differences are in the details (exact opcodes to be used etc). Hopefully
this will be easy to extend to s64 G_FCMP.

llvm-svn: 307365
2017-07-07 08:39:04 +00:00
Diana Picus c3a9c34761 [ARM] GlobalISel: Map s32 G_FCMP in reg bank select
Map hard G_FCMP operands to FPR and the result to GPR.

llvm-svn: 307245
2017-07-06 09:57:46 +00:00
Diana Picus d0104eaae8 [ARM] GlobalISel: Legalize G_FCMP for s32
This covers both hard and soft float.

Hard float is easy, since it's just Legal.

Soft float is more involved, because there are several different ways to
handle it based on the predicate: one and ueq need not only one, but two
libcalls to get a result. Furthermore, we have large differences between
the values returned by the AEABI and GNU functions.

AEABI functions return a nice 1 or 0 representing true and respectively
false. GNU functions generally return a value that needs to be compared
against 0 (e.g. for ogt, the value returned by the libcall is > 0 for
true).  We could introduce redundant comparisons for AEABI as well, but
they don't seem easy to remove afterwards, so we do different processing
based on whether or not the result really needs to be compared against
something (and just truncate if it doesn't).

llvm-svn: 307243
2017-07-06 09:09:33 +00:00
Diana Picus cd460c89c4 [ARM] GlobalISel: Widen s1, s8, s16 G_CONSTANT
Get the legalizer to widen small constants.

llvm-svn: 307239
2017-07-06 08:04:16 +00:00
Diana Picus fc1675eb16 [GlobalISel] Refactor Legalizer helpers for libcalls
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.

llvm-svn: 307149
2017-07-05 12:57:24 +00:00
Sjoerd Meijer 6d14fdf62d [AsmParser] Mnemonic Spell Corrector
This implements suggesting other mnemonics when an invalid one is specified,
for example:

$ echo "adXd r1,r2,#3" | llvm-mc -triple arm
<stdin>:1:1: error: invalid instruction, did you mean: add, qadd?
adXd r1,r2,#3
^

The implementation is target agnostic, but as a first step I have added it only
to the ARM backend; so the ARM backend is a good example if someone wants to
enable this too for another target.

Differential Revision: https://reviews.llvm.org/D33128

llvm-svn: 307148
2017-07-05 12:39:13 +00:00
Diana Picus 3e8851a1b4 [ARM] GlobalISel: Extract tiny helper. NFC
Extract functionality for determining if the target uses AEABI.

llvm-svn: 307145
2017-07-05 11:53:51 +00:00
Hiroshi Inoue 79f8933f23 fix trivial typos in comments; NFC
llvm-svn: 307094
2017-07-04 16:35:26 +00:00
Daniel Sanders 6ab0daade8 [globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.

The following patches will expand on this further to fully fix the regressions.

Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33758

llvm-svn: 307079
2017-07-04 14:35:06 +00:00
Eric Christopher 3df231a1f7 Remove the default ARMSubtarget from the ARM TargetMachine.
This enables us to ensure better LTO and code generation in the face of module linking.
Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing.

llvm-svn: 306939
2017-07-01 03:41:53 +00:00
Eric Christopher 015dc2094e Rewrite ARM execute only support to avoid the use of a command line flag and unqualified ARMSubtarget lookup.
Paired with a clang commit to use the new behavior.

llvm-svn: 306927
2017-07-01 02:55:22 +00:00
Quentin Colombet 51b7af3e14 [ARM] Move GISel accessor initialization from TargetMachine to Subtarget.
NFC

llvm-svn: 306920
2017-07-01 00:45:45 +00:00
Rafael Espindola 76287ab3a0 Rename and adjust processFixupValue.
It was not processing any value. All that it ever did was force
relocations, so name it shouldForceRelocation.

llvm-svn: 306906
2017-06-30 22:47:27 +00:00
Tim Northover 2b5f03aa12 ARM: fix big-endian 64-bit cmpxchg.
On big-endian machines the high and low parts of the value accessed by ldrexd
and strexd are swapped around. To account for this we swap inputs and outputs
in ISelLowering.

Patch by Bharathi Seshadri.

llvm-svn: 306865
2017-06-30 19:51:02 +00:00
Kristof Beyls b539ea5393 [GlobalISel] Make multi-step legalization work.
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.

This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.

Differential Revision: https://reviews.llvm.org/D32529

llvm-svn: 306806
2017-06-30 08:26:20 +00:00
Eric Christopher ee837a59f7 Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser.
Modeled Triple::GNU after front end code (aapcs abi) and  updated tests that expect apcs abi.

Based heavily on a patch by Ana Pazos!

llvm-svn: 306768
2017-06-30 00:03:54 +00:00
Eugene Leviant 6269d39f44 [llvm-objdump] Handle invalid instruction gracefully on ARM
Differential revision: https://reviews.llvm.org/D34813

llvm-svn: 306687
2017-06-29 15:38:47 +00:00
Florian Hahn 08fdd040b5 [ARM] Add tGPRwithpc register class and use it for TBB/THH
Summary:
TBB and THH allow using a Thumb GPR or the PC as destination operand.
A few machine verifier failures where due to those instructions not
expecting PC as destination operand.

Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add
test coverage even if expensive checks are disabled.



Reviewers: MatzeB, t.p.northover, jmolloy

Reviewed By: MatzeB

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34610

llvm-svn: 306654
2017-06-29 08:45:31 +00:00
Rafael Espindola 920fa14011 Don't repeat names and reformat. NFC.
llvm-svn: 306556
2017-06-28 16:00:16 +00:00
John Brawn 75d76e5e95 [ARM] Improve if-conversion for M-class CPUs without branch predictors
The current heuristic in isProfitableToIfCvt assumes we have a branch predictor,
and so gives the wrong answer in some cases when we don't. This patch adds a
subtarget feature to indicate that a subtarget has no branch predictor, and
changes the heuristic in isProfitableToiIfCvt when it's present. This gives a
slight overall improvement in a set of embedded benchmarks on Cortex-M4 and
Cortex-M33.

Differential Revision: https://reviews.llvm.org/D34398

llvm-svn: 306547
2017-06-28 14:11:15 +00:00
Kristof Beyls eecb353d0e [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8).
The benchmarking summarized in
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed
this is beneficial for a wide range of cores.

As is to be expected, quite a few small adaptations are needed to the
regressions tests, as the difference in scheduling results in:
- Quite a few small instruction schedule differences.
- A few changes in register allocation decisions caused by different
 instruction schedules.
- A few changes in IfConversion decisions, due to a difference in
 instruction schedule and/or the estimated cost of a branch mispredict.

llvm-svn: 306514
2017-06-28 07:07:03 +00:00
Diana Picus 0e74a134f8 [ARM] GlobalISel: Support G_SELECT for pointers
All we need to do is mark it as legal, otherwise it's just like s32.

llvm-svn: 306390
2017-06-27 10:29:50 +00:00
Diana Picus 7145d22f81 [ARM] GlobalISel: Support G_SELECT for i32
* Mark as legal for (s32, i1, s32, s32)
* Map everything into GPRs
* Select to two instructions: a CMP of the condition against 0, to set
  the flags, and a MOVCCr to select between the two inputs based on the
  flags that we've just set

llvm-svn: 306382
2017-06-27 09:19:51 +00:00
Rafael Espindola 6418856127 Simplify the processFixupValue interface. NFC.
llvm-svn: 306202
2017-06-24 05:22:28 +00:00
Rafael Espindola f351292141 Remove redundant argument.
llvm-svn: 306189
2017-06-24 00:26:57 +00:00
Rafael Espindola 801b42de31 ARM: move some logic from processFixupValue to applyFixup.
processFixupValue is called on every relaxation iteration. applyFixup
is only called once at the very end. applyFixup is then the correct
place to do last minute changes and value checks.

While here, do proper range checks again for fixup_arm_thumb_bl. We
used to do it, but dropped because of thumb2. We now do it again, but
use the thumb2 range.

llvm-svn: 306177
2017-06-23 22:52:36 +00:00
Rafael Espindola 58173b9720 COFF: Produce an error on invalid pcrel relocs.
X86_64 COFF only has support for 32 bit pcrel relocations. Produce an
error on all others.

Note that gnu as has extended the relocation values to support
this. It is not clear if we should support the gnu extension.

llvm-svn: 306082
2017-06-23 04:07:44 +00:00
Florian Hahn 5991b5be74 [ARM] Create relocations for beq.w branches to ARM function syms.
Summary:
The ARM ELF ABI requires the linker to do interworking for wide
conditional branches from Thumb code to ARM code. 

That was pointed out by @peter.smith in the comments for D33436.

Reviewers: rafael, peter.smith, echristo

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, peter.smith

Differential Revision: https://reviews.llvm.org/D34447

llvm-svn: 306009
2017-06-22 15:32:41 +00:00
Kristof Beyls 9665249fd8 Don't conditionalize Neon instructions, even in IT blocks.
This has been deprecated since ARMARM v7-AR, release C.b, published back
in 2012.

This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was
introduced to check that conditionalization of Neon instructions did
happen when generating Thumb2. However, the test had evolved and was no
longer testing that. Rather than trying to adapt that test, this commit
introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can
now use the MIR framework to write nicer/more maintainable tests.

llvm-svn: 305998
2017-06-22 12:11:38 +00:00
John Brawn ed78aaf093 [ARM] Add .w aliases of MOV with shifted operand
These appear to have been simply missing.

Differential Revision: https://reviews.llvm.org/D34461

llvm-svn: 305993
2017-06-22 10:30:53 +00:00
John Brawn 192f74a84d [ARM] Clean up choice of narrow instructions in ARMAsmParser, NFC
This patch makes a couple of changes to how we decide whether to use the narrow
or wide encoding of thumb2 instructions:
 * Common out the detection of the .w qualifier
 * Check for the CPSR operand in a consistent way

Differential Revision: https://reviews.llvm.org/D34460

llvm-svn: 305992
2017-06-22 10:29:31 +00:00
Florian Hahn b489e56ae2 [ARM] Add macro fusion for AES instructions.
Summary:
This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES
instructions back to back and adds FeatureFuseAES to enable the feature.

Reviewers: evandro, javed.absar, rengolin, t.p.northover

Reviewed By: javed.absar

Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34142

llvm-svn: 305988
2017-06-22 09:39:36 +00:00
Rafael Espindola 88d9e37ec8 Use a MutableArrayRef. NFC.
llvm-svn: 305968
2017-06-21 23:06:53 +00:00
Alexandros Lamprineas 2b2b420563 [ARM] Support constant pools in data when generating execute-only code.
Resubmission of r305387, which was reverted at r305390. The Address
Sanitizer caught a stack-use-after-scope of a Twine variable. This
is now fixed by passing the Twine directly as a function parameter.

The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305776
2017-06-20 07:20:52 +00:00
Diana Picus 78aaf7db04 [ARM] GlobalISel: Support G_ICMP for s8 and s16
Widen to s32 (like all other binary ops).

llvm-svn: 305683
2017-06-19 11:47:28 +00:00
Diana Picus 621894ac76 [ARM] GlobalISel: Support G_ICMP for i32 and pointers
Add support throughout the pipeline:
- mark as legal for s32 and pointers
- map to GPRs
- lower to a sequence of instructions, which moves 0 or 1 into the
  result register based on the flags set by a CMPrr

We have copied from FastISel a helper function which maps CmpInst
predicates into ARMCC codes. Ideally, we should be able to move it
somewhere that both FastISel and GlobalISel can use.

llvm-svn: 305672
2017-06-19 09:40:51 +00:00
Diana Picus 02e11010b2 [ARM] GlobalISel: Add support for i32 modulo
Add support for modulo for targets that have hardware division and for
those that don't. When hardware division is not available, we have to
choose the correct libcall to use. This is generally straightforward,
except for AEABI.

The AEABI variant is trickier than the other libcalls because it
returns { quotient, remainder }, instead of just one value like the
other libcalls that we've seen so far. Therefore, we need to use custom
lowering for it. However, we don't want to have too much special code,
so we refactor the target-independent code in the legalizer by adding a
helper for replacing an instruction with a libcall. This helper is used
by the legalizer itself when dealing with simple calls, and also by the
custom ARM legalization for the more complicated AEABI divmod calls.

llvm-svn: 305459
2017-06-15 10:53:31 +00:00
Diana Picus 8fd1601d32 [ARM] GlobalISel: Lower only homogeneous struct args
Lowering mixed struct args, params and returns used G_INSERT, which is a
bit more convoluted to support through the entire pipeline. Since they
don't occur that often in practice, it's probably wiser to leave them
out until later.

Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which
has good support in the legalizer. These occur e.g. as the return of
__aeabi_idivmod, so it's nice to be able to support them.

llvm-svn: 305458
2017-06-15 09:42:02 +00:00
Alexandros Lamprineas 1c15ee2631 Revert "[ARM] Support constant pools in data when generating execute-only code."
This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0.

I've upset a buildbot which runs the address sanitizer:
ERROR: AddressSanitizer: stack-use-after-scope
lib/Target/ARM/ARMISelLowering.cpp:2690
That Twine variable is used illegally.

llvm-svn: 305390
2017-06-14 15:00:08 +00:00
Alexandros Lamprineas c582d6e133 [ARM] Support constant pools in data when generating execute-only code.
The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305387
2017-06-14 13:22:41 +00:00
Oliver Stannard 852fbd2fea [ARM] Add scheduling classes for VFNM[AS]
The VFNM[AS] instructions did not have scheduling information attached, which
was causing assertion failures with the Cortex-A57 scheduling model and
-fp-contract=fast, because the Cortex-A57 sched model claims to be complete.

Differential Revision: https://reviews.llvm.org/D34139

llvm-svn: 305288
2017-06-13 13:04:32 +00:00
Daniel Neilson c0112ae8da Const correctness for TTI::getRegisterBitWidth
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.

Reviewers: chandlerc, rnk, reames

Reviewed By: reames

Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D33903

llvm-svn: 305189
2017-06-12 14:22:21 +00:00
Javed Absar 9e1ff8654f [ARM] Custom machine-scheduler. NFCI.
This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka. 
Differential Revision: https://reviews.llvm.org/D34039

llvm-svn: 305078
2017-06-09 14:07:21 +00:00
Oliver Stannard ad0973557c [ARM] Add scheduling info for VFMS
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.

Differential Revision: https://reviews.llvm.org/D34040

llvm-svn: 305064
2017-06-09 09:19:09 +00:00
Florian Hahn 28a61d64e2 [ARM] Use FixupKind variable in processFixupValue (cleanup, NFC).
llvm-svn: 304905
2017-06-07 12:58:08 +00:00
Diana Picus 0b4190a9d6 [ARM] GlobalISel: Purge G_SEQUENCE
According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.

This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.

There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.

llvm-svn: 304902
2017-06-07 12:35:05 +00:00
Diana Picus 0196427b03 [ARM] GlobalISel: Support G_XOR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select to EORrr via TableGen'erated code

llvm-svn: 304898
2017-06-07 11:57:30 +00:00
Diana Picus eeb0aad8e4 [ARM] GlobalISel: Support G_OR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select ORRrr thanks to TableGen'erated code

llvm-svn: 304890
2017-06-07 10:14:23 +00:00
Diana Picus 8445858a93 [ARM] GlobalISel: Support G_AND
This is identical to the support for the other binary operators:
- widen to s32
- map into GPR
- select ANDrr (via TableGen'erated code)

llvm-svn: 304885
2017-06-07 09:17:41 +00:00
Florian Hahn 9afd9d9254 [ARM] Create relocations for unconditional branches.
Summary:
Relocations are required for unconditional branches to function symbols with
different execution mode. Without this patch, incorrect branches are
generated for tail calls between functions with different execution
mode.


Reviewers: peter.smith, rafael, echristo, kristof.beyls

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33898

llvm-svn: 304882
2017-06-07 08:54:47 +00:00
Zachary Turner 264b5d9e88 Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Eugene Zelenko fb69e66cff [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304839
2017-06-06 22:22:41 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Peter Smith d16c55de6d [ARM] Add curly braces around switch case [NFC]
My previous commit r304702 introduced a new case into a switch statement.
This case defined a variable but I forgot to add the curly brackets around the
case to limit the scope.

This change puts the curly braces back in so that the next person that adds a
case doesn't get a build failure. Thanks to avieira for the spot.

Differential Revision: https://reviews.llvm.org/D33931

llvm-svn: 304785
2017-06-06 10:22:49 +00:00
Diana Picus 0091cc3528 [ARM] GlobalISel: Constrain callee register on indirect calls
When lowering calls, we generate instructions with machine opcodes
rather than generic ones. Therefore, we need to constrain the register
classes of the operands.

Also enable the machine verifier on the arm-irtranslator.ll test, since
that would've caught this issue.

Fixes (part of) PR32146.

llvm-svn: 304712
2017-06-05 12:54:53 +00:00
Peter Smith adde667007 [ARM] Support fixup for Thumb2 modified immediate
This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand
"T2SOImm". The fixup permits code such as:
.L1:
 sub r3, r3, #.L2 - .L1
.L2:
to assemble in Thumb2 as well as in ARM state.
    
The operand predicate isT2SOImm() explicitly doesn't match expressions
containing :upper16: and :lower16: as expressions with these operators
must match the movt and movw instructions.
    
The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the
fixup delays the error message till after the assembler has quit due to
the other errors.
    
As the mov instruction shares the t2_so_imm_asmoperand mov instructions
with a non constant expression now match t2MOVi rather than t2MOVi16 so the
error message is slightly different.
    
Fixes PR28647

Differential Revision: https://reviews.llvm.org/D33492

llvm-svn: 304702
2017-06-05 09:37:12 +00:00
Diana Picus e7aa90987d [ARM] GlobalISel: Support struct params/returns
Very very similar to the support for arrays. As with arrays, we don't
support returning large structs that wouldn't fit in R0-R3. Most
front-ends would likely use sret arguments for that anyway.

The only significant difference is that when splitting a struct, we need
to make sure we set the correct original alignment on each member,
otherwise it may get split incorrectly between stack and registers.

llvm-svn: 304536
2017-06-02 10:16:48 +00:00
Javed Absar 4ae7e81233 [ARM] Cortex-A57 scheduling model for ARM backend (AArch32)
This patch implements the Cortex-A57 scheduling model.
The main code is in ARMScheduleA57.td, ARMScheduleA57WriteRes.td.
Small changes in cpp,.h files to support required scheduling predicates.

Scheduling model implemented according to:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf.

Patch by : Andrew Zhogin (submitted on his behalf, as requested).
Rewiewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls.
Differential Revision: https://reviews.llvm.org/D28152

llvm-svn: 304530
2017-06-02 08:53:19 +00:00
Florian Hahn fca7b8348f [ARM] Create relocations for Thumb functions calling ARM fns in ELF.
Summary:
Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.

Reviewers: rafael, t.p.northover, peter.smith, kristof.beyls

Reviewed By: peter.smith

Subscribers: srhines, echristo, aemerson, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33436

llvm-svn: 304413
2017-06-01 13:50:57 +00:00
Matthias Braun d6a36ae282 TargetMachine: Indicate whether machine verifier passes.
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.

This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!

Differential Revision: https://reviews.llvm.org/D33696

llvm-svn: 304320
2017-05-31 18:41:23 +00:00
Matthias Braun 05eeadbfd1 ARM: Fix cmpxchg O0 expansion
This is the equivalent of r304048 for ARM:

- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304267
2017-05-31 01:21:35 +00:00
Matthias Braun 0dba4e3509 ARM: Do not add reserved registers to block livein lists; NFC
llvm-svn: 304266
2017-05-31 01:21:30 +00:00
Matthias Braun 5e394c3d6f TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.

While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.

llvm-svn: 304247
2017-05-30 21:36:41 +00:00
Matthias Braun 700603555a ARM: Add missing flags to TBB_[JH]T pseudo instructions
NFC except for calming down the machine verifier in some cases.

llvm-svn: 304227
2017-05-30 18:52:33 +00:00
Craig Topper f6d4dc5b4a [SelectionDAG] Set ISD::FPOWI to Expand by default
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

Differential Revision: https://reviews.llvm.org/D33530

llvm-svn: 304215
2017-05-30 15:27:55 +00:00
Diana Picus 0c05cce4e0 [ARM] GlobalISel: Extract helper. NFCI.
Create a helper to deal with the common code for merging incoming values
together after they've been split during call lowering. There's likely
more stuff that can be commoned up here, but we'll leave that for later.

llvm-svn: 304143
2017-05-29 09:09:54 +00:00
Diana Picus bf4aed2c38 [ARM] GlobalISel: Support array returns
These are a bit rare in practice, but they don't require anything
special compared to array parameters, so support them as well.

llvm-svn: 304137
2017-05-29 08:19:19 +00:00
Diana Picus 8cca8cb0ce [ARM] GlobalISel: Support array parameters/arguments
Clang coerces structs into arrays, so it's a good idea to support them.
Most of the support boils down to getting the splitToValueTypes helper
to actually split types. We then use G_INSERT/G_EXTRACT to deal with the
parts.

llvm-svn: 304132
2017-05-29 07:01:52 +00:00
Matthias Braun ac4307c41e LivePhysRegs: Rework constructor + documentation; NFC
- Take reference instead of pointer to a TRI that cannot be nullptr.
- Improve documentation comments.

llvm-svn: 304038
2017-05-26 21:51:00 +00:00
John Brawn 9009d2905d [ARM] Fix lowering of misaligned memcpy/memset
Currently getOptimalMemOpType returns i32 for large enough sizes without
checking for alignment, leading to poor code generation when misaligned accesses
aren't permitted as we generate a word store then later split it up into byte
stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for
memset we splat the memset value into a word then immediately split it up
again.

Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type
to use, but also fix a bug there where it wasn't correctly checking if
misaligned memory accesses are allowed.

Differential Revision: https://reviews.llvm.org/D33442

llvm-svn: 303990
2017-05-26 13:59:12 +00:00
Florian Hahn d211fe7c26 [ARM] Remove ThumbTargetMachines. (NFC)
Summary:
Thumb code generation is controlled by ARMSubtarget and the concrete
ThumbLETargetMachine and ThumbBETargetMachine are not needed.

Eric Christopher suggested removing the unneeded target machines in
https://reviews.llvm.org/D33287.

I think it still makes sense to keep separate TargetMachines for big and
little endian as we probably do not want to have different endianess for
difference functions in a single compilation unit. The MIPS backend has
two separate TargetMachines for big and little endian as well. 

Reviewers: echristo, rengolin, kristof.beyls, t.p.northover

Reviewed By: echristo

Subscribers: aemerson, javed.absar, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D33318

llvm-svn: 303733
2017-05-24 10:18:57 +00:00
Javed Absar a32e3a1acf [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI
This patch adds missing scheds for Neon VLDx/VSTx instructions.
This will help one write schedulers easier/faster in the future for ARM sub-targets.
Existing models will not affected by this patch.
Reviewed by: Renato Golin, Diana Picus
Differential Revision: https://reviews.llvm.org/D33120

llvm-svn: 303717
2017-05-24 05:32:48 +00:00
Oleg Ranevskyy 09df0020fc [ARM] Temporarily disable globals promotion to constant pools to prevent miscompilation
Summary:
A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries.

The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed.

Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1.

Reviewers: efriedma, jmolloy

Reviewed By: efriedma

Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar

Differential Revision: https://reviews.llvm.org/D33446

llvm-svn: 303679
2017-05-23 19:38:37 +00:00
Nirav Dave 6c910c0dd8 [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.
llvm-svn: 303673
2017-05-23 18:53:02 +00:00
James Molloy 6110be9759 Re-apply r302416: [ARM] Clear the constant pool cache on explicit .ltorg directives
Re-applying now that PR32825 which was raised on the commit this fixed up is now known to have also been fixed by this commit.

Original commit message:
    Multiple ldr pseudoinstructions with the same constant value will
    reuse the same constant pool entry. However, if the constant pool
    is explicitly flushed with a .ltorg directive, we should not try
    to reference constants in the previous pool any longer, since they
    may be out of range.

    This fixes assembling hand-written assembler source which repeatedly
    loads the same constant value, across a binary size larger than the
    pc-relative fixup range for ldr instructions (4096 bytes). Such
    assembler source already uses explicit .ltorg instructions to emit
    constant pools with regular intervals. However if we try to reuse
    constants emitted in earlier pools, they end up out of range.

    This makes the output of the testcase match what binutils gas does
    (prior to this patch, it would fail to assemble).

    Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 303540
2017-05-22 09:42:07 +00:00
James Molloy 5cc75ae8f9 Revert "[ARM] Clear the constant pool cache on explicit .ltorg directives"
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).

This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.

llvm-svn: 303536
2017-05-22 08:49:28 +00:00
Francis Visoiu Mistrih 8b61764cbb [LegacyPassManager] Remove TargetMachine constructors
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.

The patterns replaced here are:

* Passes handling a null TargetMachine call
  `getAnalysisIfAvailable<TargetPassConfig>`.

* Passes not handling a null TargetMachine
  `addRequired<TargetPassConfig>` and call
  `getAnalysis<TargetPassConfig>`.

* MachineFunctionPasses now use MF.getTarget().

* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.

This fixes a crash when running `llc -start-before prologepilog`.

PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.

Related to PR30324.

Differential Revision: https://reviews.llvm.org/D33222

llvm-svn: 303360
2017-05-18 17:21:13 +00:00
Diana Picus eafa4aa910 Reland r303247: [ARM] GlobalISel: Remove dead instruction selection code
It only failed on llvm-clang-x86_64-expensive-checks-win, probably
because the TableGen stuff hasn't been regenerated.
Requires a clean build.

llvm-svn: 303252
2017-05-17 12:42:52 +00:00
Diana Picus 36e4ba0f6e Revert "[ARM] GlobalISel: Remove dead instruction selection code"
This reverts commit r303247 because the tests are failing on some bots.
Sorry!

llvm-svn: 303249
2017-05-17 11:56:07 +00:00
Diana Picus 68d21c864e [ARM] GlobalISel: Remove dead instruction selection code
We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove
the hand-written versions.

llvm-svn: 303247
2017-05-17 11:39:26 +00:00
Francis Visoiu Mistrih b52e036600 BitVector: add iterators for set bits
Differential revision: https://reviews.llvm.org/D32060

llvm-svn: 303227
2017-05-17 01:07:53 +00:00
Renato Golin d69570e017 Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"
Revert "[ARM] Mark LEApcrel as not having side effects"

This reverts commit r303054 and r303053, as they broke the ARM
self-hosting buildbots:

http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845

Offline investigation on course.

llvm-svn: 303193
2017-05-16 17:59:07 +00:00
John Brawn 9486becf09 [ARM] Mark LEApcrel instructions as isAsCheapAsAMove
Doing this means that if an LEApcrel is used in two places we will rematerialize
instead of generating two MOVs. This is particularly useful for printfs using
the same format string, where we want to generate an address into a register
that's going to get corrupted by the call.

Differential Revision: https://reviews.llvm.org/D32858

llvm-svn: 303054
2017-05-15 11:57:54 +00:00
John Brawn 43132c46a6 [ARM] Mark LEApcrel as not having side effects
Doing this lets us hoist it out of loops, and I've also marked it as
rematerializable the same as the thumb1 and thumb2 counterparts.

It looks like it being marked as such was just a mistake, as the commit that
made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the
LEApcrelJT instructions were marked as having side-effects, so it looks like
the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was
accidentally marked as such also.

Differential Revision: https://reviews.llvm.org/D32857

llvm-svn: 303053
2017-05-15 11:50:21 +00:00
Diana Picus 9cfbc6d94f [ARM][GlobalISel] Legalize narrow scalar ops by widening
This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB
and MUL to 32 bits since we only have TableGen patterns for 32 bits.
See the commit message for r292827 for more details.

At this point we could just remove some of the tests for regbankselect
and instruction-select, since we're not going to see any narrow
operations at those levels anymore. Instead I decided to update them
with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences
generated by the legalizer.

llvm-svn: 302782
2017-05-11 09:45:57 +00:00
Diana Picus 657bfd3302 [ARM][GlobalISel] Support for G_ANYEXT
G_ANYEXT can be introduced by the legalizer when widening scalars. Add
support for it in the register bank info (same mapping as everything
else) and in the instruction selector.

When selecting it, we treat it as a COPY, just like G_TRUNC. On this
occasion we get rid of some assertions in selectCopy so we can reuse it.
This shouldn't be a problem at the moment since we're not supporting any
complicated cases (e.g. FPR, different register banks). We might want to
separate the paths when we do.

llvm-svn: 302778
2017-05-11 08:28:31 +00:00
Serge Guelton e38003f839 Suppress all uses of LLVM_END_WITH_NULL. NFC.
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.

Differential Revision: https://reviews.llvm.org/D32541

llvm-svn: 302571
2017-05-09 19:31:13 +00:00
Tim Shen 04de70d3a7 [Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used
for determining those two booleans.

The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.

Differential Revision: https://reviews.llvm.org/D32762

llvm-svn: 302539
2017-05-09 15:27:17 +00:00
Aaron Ballman 3234647df6 Amend r302535; ifndef and ifdef are different, as it turns out.
llvm-svn: 302537
2017-05-09 15:12:03 +00:00
Aaron Ballman 06297e839a ARMRegisterBankInfo.h requires LLVM_BUILD_GLOBAL_ISEL to be defined. If it is not defined, then ARMGenRegisterBank.inc is not table generated and the inclusion of this header causes the build to fail.
llvm-svn: 302535
2017-05-09 14:59:48 +00:00
Serge Pavlov d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Diana Picus 95640a1c4d [ARM GlobalISel] Remove hand-written G_FADD selection
Remove the code selecting G_FADD - now that TableGen can handle more
opcodes, it's not needed anymore.

llvm-svn: 302511
2017-05-09 08:32:42 +00:00
Tim Northover c48c993b75 ARM: use divmod libcalls on embedded MachO platforms too.
The separated libcalls are implemented in terms of __divmodsi4 and __udivmodsi4
anyway, so we should always use them if possible.

llvm-svn: 302462
2017-05-08 20:00:14 +00:00
Craig Topper 8297e52285 [ARM] Use a Changed flag to avoid making a pass's return value dependent on a compare with a Statistic object.
Statistic compile to always be 0 in release build so this compare would always return false. And in the debug builds Statistic are global variables and remember their values across pass runs. So this compare returns true anytime the pass runs after the first time it modifies something.

This was found after reviewing all usages of comparison operators on a Statistic object. We had some internal code that did a compare with a statistic that caused a mismatch in output between debug and release builds. So we did an audit out of paranoia.

llvm-svn: 302450
2017-05-08 18:02:51 +00:00
Simon Pilgrim f5ca255d18 [ARM][NEON] Add support for ISD::ABS lowering
Update NEON int_arm_neon_vabs intrinsic to use the ISD::ABS opcode directly

Added constant folding tests.

Differential Revision: https://reviews.llvm.org/D32938

llvm-svn: 302417
2017-05-08 10:37:34 +00:00
Martin Storsjo fd4c158a84 [ARM] Clear the constant pool cache on explicit .ltorg directives
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.

This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.

This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).

Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 302416
2017-05-08 10:26:24 +00:00
Quentin Colombet 245994d968 [RegisterBankInfo] Uniquely allocate instruction mapping.
This is a step toward having statically allocated instruciton mapping.
We are going to tablegen them eventually, so let us reflect that in
the API.

NFC.

llvm-svn: 302316
2017-05-05 22:48:22 +00:00
Matthias Braun 4682ac6c83 ARM: Compute MaxCallFrame size early
This exposes a method in MachineFrameInfo that calculates
MaxCallFrameSize and calls it after instruction selection in the ARM
target.

This avoids
ARMBaseRegisterInfo::canRealignStack()/ARMFrameLowering::hasReservedCallFrame()
giving different answers in early/late phases of codegen.

The testcase shows a particular nasty example result of that where we
would fail to properly align an alloca.

Differential Revision: https://reviews.llvm.org/D32622

llvm-svn: 302303
2017-05-05 22:04:05 +00:00
Craig Topper f0aeee01c3 [KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits.
This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.

Differential Revision: https://reviews.llvm.org/D32637

llvm-svn: 302262
2017-05-05 17:36:09 +00:00
John Brawn 1b74f8c51f [ARM] Add support for ORR and ORN instruction substitutions
Recently support was added for substituting one intruction for another by
negating or inverting the immediate, but ORR and ORN were missed so this patch
adds them.

This one is slightly different to the others in that ORN only exists in thumb,
so we only do the substitution in thumb.

Differential Revision: https://reviews.llvm.org/D32534

llvm-svn: 302224
2017-05-05 11:31:25 +00:00
Sam Parker df337704f0 [ARM] ACLE Chapter 9 intrinsics
Added the integer data processing intrinsics from ACLE v2.1 Chapter 9
but I have missed out the saturation_occurred intrinsics for now. For
the instructions that read and write the GE bits, a chain is included
and the only instruction that reads these flags (sel) is only
selectable via the implemented intrinsic.

Differential Revision: https://reviews.llvm.org/D32281

llvm-svn: 302126
2017-05-04 07:31:28 +00:00
Reid Kleckner a0b45f4bfc [IR] Abstract away ArgNo+1 attribute indexing as much as possible
Summary:
Do three things to help with that:
- Add AttributeList::FirstArgIndex, which is an enumerator currently set
  to 1. It allows us to change the indexing scheme with fewer changes.
- Add addParamAttr/removeParamAttr. This just shortens addAttribute call
  sites that would otherwise need to spell out FirstArgIndex.
- Remove some attribute-specific getters and setters from Function that
  take attribute list indices.  Most of these were only used from
  BuildLibCalls, and doesNotAlias was only used to test or set if the
  return value is malloc-like.

I'm happy to split the patch, but I think they are probably easier to
review when taken together.

This patch should be NFC, but it sets the stage to change the indexing
scheme to this, which is more convenient when indexing into an array:
  0: func attrs
  1: retattrs
  2...: arg attrs

Reviewers: chandlerc, pete, javed.absar

Subscribers: david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D32811

llvm-svn: 302060
2017-05-03 18:17:31 +00:00
Tim Northover 4a01ffbd6a ARM: avoid handing a deleted node back to TableGen during ISel.
When we replaced the multiplicand the destination node might already exist.
When that happens the original gets CSEd and deleted. However, it's actually
used as the offset so nonsense is produced.

Should fix PR32726.

llvm-svn: 301983
2017-05-02 22:45:19 +00:00
Tim Northover f9d8eee3db ARM: add arm1176j-f processor
I doubt anyone actually uses it, and I'm not even entirely convinced it exists
myself; but it is our default for "clang -arch armv6". Functionally, if it does
exist it's identical to the arm1176jz-f from LLVM's point of view (the
difference is apparently in the "Security Extensions").

llvm-svn: 301962
2017-05-02 19:06:13 +00:00
Diana Picus 8abcbbb24b [ARM] GlobalISel: Use TableGen instruction selector
Emit and use the TableGen instruction selector for ARM. At the moment,
this allows us to remove the hand-written code for selecting G_SDIV and
G_UDIV.

Future commits will focus on increasing the code coverage for it and
removing more dead code from the current instruction selector.

llvm-svn: 301905
2017-05-02 09:40:49 +00:00
Reid Kleckner 6652a52e2b Use Argument::hasAttribute and AttributeList::ReturnIndex more
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.

NFC

llvm-svn: 301666
2017-04-28 18:37:16 +00:00
Diana Picus 6f975692e5 [ARM] GlobalISel: fixup r301632
Actually remove ARMInstructionSelector.h... Forgot to stage the removal
in the previous commit.

llvm-svn: 301633
2017-04-28 09:20:31 +00:00
Diana Picus 674888d84c [ARM] GlobalISel: Get rid of ARMInstructionSelector.h. NFC.
Declare the ARMInstructionSelector in an anonymous namespace, to make it
more in line with the other targets which were migrated to this in
r299637 in order to avoid TableGen'erated headers being included in
non-GlobalISel builds.

llvm-svn: 301632
2017-04-28 09:10:38 +00:00
Craig Topper d0af7e8ab8 [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Diana Picus 4f46be327c [ARM] GlobalISel: Fix extended stack operands
Fix a crash when trying to extend a value passed as a sign- or
zero-extended stack parameter. The cause of the crash was that we were
setting the size of the loaded value to 32 bits, and then tyring to
extend again to 32 bits.

This patch addresses the issue by also introducing a G_TRUNC after the
load. This will leave the unused bits to their original values set by
the caller, while being consistent about the types. For values that are
not extended, we just use a smaller load.

llvm-svn: 301531
2017-04-27 10:23:30 +00:00
Krzysztof Parzyszek 44e25f37ae Move size and alignment information of regclass to TargetRegisterInfo
1. RegisterClass::getSize() is split into two functions:
   - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
   - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
   - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783

llvm-svn: 301221
2017-04-24 18:55:33 +00:00
Diana Picus f53865daa4 [ARM] GlobalISel: Legalize s8 and s16 G_(S|U)DIV
We have to widen the operands to 32 bits and then we can either use
hardware division if it is available or lower to a libcall otherwise.

At the moment it is not enough to set the Legalizer action to
WidenScalar, since for libcalls it won't know what to do (it won't be
able to find what size to widen to, because it will find Libcall and not
Legal for 32 bits). To hack around this limitation, we request Custom
lowering, and as part of that we widen first and then we run another
legalizeInstrStep on the widened DIV.

llvm-svn: 301166
2017-04-24 09:12:19 +00:00
Diana Picus b70e88bdec [ARM] GlobalISel: Support G_(S|U)DIV for s32
Add support for both targets with hardware division and without. For
hardware division we have to add support throughout the pipeline
(legalizer, reg bank select, instruction select). For targets without
hardware division, we only need to mark it as a libcall.

llvm-svn: 301164
2017-04-24 08:20:05 +00:00
Diana Picus 95a8aa93e2 [ARM] GlobalISel: Select G_CONSTANT with CImm operands
When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm
operand. We used to just leave the G_CONSTANT operand unchanged, which
works in some cases (such as the GEP offsets that we create when
referring to stack slots). However, in many other places the G_CONSTANTs
are created with CImm operands. This patch makes sure to handle those as
well, and to error out gracefully if in the end we don't end up with an
Imm operand.

Thanks to Oliver Stannard for reporting this issue.

llvm-svn: 301162
2017-04-24 06:30:56 +00:00
Daniel Sanders 2deea1878e [globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.

In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.

The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
	InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
	return OptionalComplexRendererFn(
	       [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.

As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.

Depends on D31418

Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31761

llvm-svn: 301079
2017-04-22 15:11:04 +00:00
Hans Wennborg 9b9a5358dd Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows.  This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.

llvm-svn: 301047
2017-04-21 21:48:41 +00:00
Hans Wennborg 04593000d8 Revert r301040 "X86: Don't emit zero-byte functions on Windows"
This broke almost all bots. Reverting while fixing.

llvm-svn: 301041
2017-04-21 21:10:37 +00:00
Hans Wennborg cb3e810714 X86: Don't emit zero-byte functions on Windows
Empty functions can lead to duplicate entries in the Guard CF Function
Table of a binary due to multiple functions sharing the same RVA,
causing the kernel to refuse to load that binary.

We had a terrific bug due to this in Chromium.

It turns out we were already doing this for Mach-O in certain
situations. This patch expands the code for that in
AsmPrinter::EmitFunctionBody() and renames
TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it
seems it was used for not just Mach-O anyway.

Differential Revision: https://reviews.llvm.org/D32330

llvm-svn: 301040
2017-04-21 20:58:12 +00:00
Tim Northover e31cf3f824 ARM: make sure we use all entries in a vector before forming a vpaddl.
Otherwise there's some mismatch, and we'll either form an illegal type or an
illegal node.

Thanks to Eli Friedman for pointing out the problem with my original solution.

llvm-svn: 301036
2017-04-21 20:35:52 +00:00
Tim Northover 1061ccca8c ARM: don't try to create an i8 -> i32 vpaddl.
DAG combine was mistakenly assuming that the step-up it was looking at was
always a doubling, but it can sometimes be a larger extension in which case
we'd crash.

llvm-svn: 301002
2017-04-21 17:21:59 +00:00
Diana Picus 64a33431eb [ARM] GlobalISel: Add support for G_TRUNC
Select them as copies. We only select if both the source and the
destination are on the same register bank, so this shouldn't cause any
trouble.

llvm-svn: 300971
2017-04-21 13:16:50 +00:00
Diana Picus f941ec0ecc [ARM] GlobalISel: Make struct arguments fail elegantly
The condition in isSupportedType didn't handle struct/array arguments
properly. Fix the check and add a test to make sure we use the fallback
path in this kind of situation. The test deals with some common cases
where the call lowering should error out. There are still some issues
here that need to be addressed (tail calls come to mind), but they can
be addressed in other patches.

llvm-svn: 300967
2017-04-21 11:53:01 +00:00
Artyom Skrobov 8d9643009f [Thumb1] The recently added tADCS and tSBCS pseudo-instructions were missing `Uses = [CPSR]`
Summary: Thanks to Oliver Stannard for helping catch this.

Reviewers: olista01, efriedma

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31815

llvm-svn: 300951
2017-04-21 07:35:21 +00:00
Tim Northover 46e58354da ARM: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300904
2017-04-20 21:56:52 +00:00
Tim Northover 8b1240b0f0 ARM: handle post-indexed NEON ops where the offset isn't the access width.
Before, we assumed that any ConstantInt offset was precisely the access width,
so we could use the "[rN]!" form. ISelLowering only ever created that kind, but
further simplification during combining could lead to unexpected constants and
incorrect codegen.

Should fix PR32658.

llvm-svn: 300878
2017-04-20 19:54:02 +00:00
Weiming Zhao 962c5a3aec [Thumb-1] Fix corner cases for compressed jump tables
Summary:
When synthesized TBB/TBH is expanded, we need to avoid the case of:
   BaseReg is redefined after the load of branching target. E.g.:

    %R2 = tLEApcrelJT <jt#1>
    %R1 =  tLDRr %R1, %R2    ==> %R2 = tLEApcrelJT <jt#1>
    %R2 = tLDRspi %SP, 12        %R2 = tLDRspi %SP, 12
    tBR_JTr %R1                  tTBB_JT %R2, %R1
`
Reviewers: jmolloy

Reviewed By: jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D32250

llvm-svn: 300870
2017-04-20 18:37:14 +00:00
Benjamin Kramer 58dadd59d9 Fix use-after-frees on memory allocated in a Recycler.
This will become asan errors once the patch lands that poisons the
memory after free. The x86 change is a hack, but I don't see how to
solve this properly at the moment.

llvm-svn: 300867
2017-04-20 18:29:14 +00:00
John Brawn 66719f63d0 [ARM] Fix handling of mapping symbols when changing sections
ChangeSection incorrectly registers LastEMSInfo as belonging to the previous
section, not the current section. This happens to work when changing sections
using .section, as the previous section is set to the current section before
the call to ChangeSection, but not when using .popsection.

Differential Revision: https://reviews.llvm.org/D32225

llvm-svn: 300831
2017-04-20 10:18:13 +00:00
Diana Picus 7c6dee9f16 [ARM] Rename HW div feature to HW div Thumb. NFCI.
The hardware div feature refers only to Thumb, but because of its name
it is tempting to use it to check for hardware division in general,
which may cause problems in ARM mode. See https://reviews.llvm.org/D32005.

This patch adds "Thumb" to its name, to make its scope clear. One
notable place where I haven't made the change is in the feature flag
(used with -mattr), which is still hwdiv. Changing it would also require
changes in a lot of tests, including clang tests, and it doesn't seem
like it's worth the effort.

Differential Revision: https://reviews.llvm.org/D32160

llvm-svn: 300827
2017-04-20 09:38:25 +00:00
Matthias Braun 8aaa368d00 ARMFrameLowering: Reserve emergency spill slot for large arguments
Re-commit after revert in r300668. Changed getMaxFPOffset() to a
more conservative heuristic instead of trying to be clever and missing
for some exotic calling conventions.

We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300761
2017-04-19 21:11:44 +00:00
Eli Friedman 70ad2751d5 [ARM] Remove redundant computeKnownBits helper.
Move the BFI logic to computeKnownBitsForTargetNode, and delete
the redundant CMOV logic.

This is intended as a cleanup, but it's probably possible to construct
a case where moving the BFI logic allows more combines.

Differential Revision: https://reviews.llvm.org/D31795

llvm-svn: 300752
2017-04-19 20:50:57 +00:00
Eli Friedman f281d490cc [ARM] Use TableGen patterns to select vtbl. NFC.
Differential Revision: https://reviews.llvm.org/D32103

llvm-svn: 300749
2017-04-19 20:39:39 +00:00
Tim Northover ff168c68dc ARM: TLS calling convention doesn't preserve r9 or r12 on Darwin.
llvm-svn: 300726
2017-04-19 18:07:54 +00:00
Renato Golin 742aed8683 Revert "ARMFrameLowering: Reserve emergency spill slot for large arguments"
This reverts commit r300639, as it broke self-hosting on ARM. PR32709.

llvm-svn: 300668
2017-04-19 09:02:52 +00:00
Diana Picus 49472ff1cf [ARM] GlobalISel: Add support for G_MUL
Support G_MUL, very similar to G_ADD and G_SUB. The only difference is
in the instruction selector, where we have to select either MUL or MULv5
depending on the target.

llvm-svn: 300665
2017-04-19 07:29:46 +00:00
Serge Pavlov 5943a96d81 ARM: Use methods to access data stored with frame instructions
In r300196 several methods were added to TarfetInstrInfo to access
data stored with call frame setup/destroy instructions. This change
replaces calls to getOperand with calls to such special methods in
ARM target.

Differential Revision: https://reviews.llvm.org/D32127

llvm-svn: 300655
2017-04-19 03:12:05 +00:00
Matthias Braun 661d3d4b00 ARMFrameLowering: Reserve emergency spill slot for large arguments
We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300639
2017-04-19 01:16:07 +00:00
Matt Arsenault 3138075dd4 DAG: Make mayBeEmittedAsTailCall parameter const
llvm-svn: 300603
2017-04-18 21:16:46 +00:00
Oliver Stannard 7ad2e8aae1 [ARM] Add hardware build attributes in assembler
In the assembler, we should emit build attributes based on the target
selected with command-line options. This matches the GNU assembler's
behaviour. We only do this for build attributes which describe the
hardware that is expected to be available, not the ones that describe
ABI compatibility.

This is done by moving some of the attribute emission code to
ARMTargetStreamer, so that it can be shared between the assembly and
code-generation code paths. Since the assembler only creates a
MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to
check raw features, and not use the convenience functions in
ARMSubtarget.

If different attributes are later specified using the .eabi_attribute
directive, then they will take precedence, as happens when the same
.eabi_attribute is specified twice.

This must be enabled by an option, because we don't want to do this when
parsing inline assembly. The attributes would match the ones emitted at
the start of the file, so wouldn't actually change the emitted object
file, but the extra directives would be added to every inline assembly
block when emitting assembly, which we'd like to avoid.

The majority of the changes in the build-attributes.ll test are just
re-ordering the directives, because the hardware attributes are now
emitted before the ABI ones. However, I did fix one bug which I spotted:
Tag_CPU_arch_profile was not being emitted for v6M.

Differential revision: https://reviews.llvm.org/D31812

llvm-svn: 300547
2017-04-18 12:52:35 +00:00
Diana Picus a3a0cccb2c [ARM] GlobalISel: Add support for G_SUB
Support G_SUB throughout the GlobalISel pipeline. It is exactly the same
as G_ADD, nothing fancy.

llvm-svn: 300546
2017-04-18 12:35:28 +00:00
Diana Picus e2626bb7c2 [ARM] Check for correct HW div when lowering divmod
For subtargets that use the custom lowering for divmod, e.g. gnueabi,
we used to check if the subtarget has hardware divide and then lower to
a div-mul-sub sequence if true, or to a libcall if false.

However, judging by the usage of hasDivide vs hasDivideInARMMode, it
seems that hasDivide only refers to Thumb. For instance, in the
ARMTargetLowering constructor, the code that specifies whether to use
libcalls for (S|U)DIV looks like this:

bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
                                      : Subtarget->hasDivideInARMMode();

In the case of divmod for arm-gnueabi, using only hasDivide() to
determine what to do means that instead of lowering to __aeabi_idivmod
to get the remainder, we lower to div-mul-sub and then further lower the
div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
not in Thumb, we generate a libcall instead of using it (this is not an
issue in practice since AFAICT none of the cores that we support have
hardware divide in ARM but not Thumb).

This patch fixes the code dealing with custom lowering to take into
account the mode (Thumb or ARM) when deciding whether or not hardware
division is available.

Differential Revision: https://reviews.llvm.org/D32005

llvm-svn: 300536
2017-04-18 08:32:27 +00:00
Reid Kleckner fb502d2f5e [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

llvm-svn: 300367
2017-04-14 20:19:02 +00:00
Andrew V. Tischenko 75745d0c3e This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs.
The details are here: https://reviews.llvm.org/D30941

llvm-svn: 300311
2017-04-14 07:44:23 +00:00
Jonas Paulsson fccc7d66c3 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052
2017-04-12 11:49:08 +00:00
Sam Parker 83b64654fd [ARM] Refactor Thumb2 sat instructions
Refactor the USAT, SSAT, USAT16 and SSAT16 instruction descriptions
for Thumb2.

Differential Revision: https://reviews.llvm.org/D31933

llvm-svn: 299945
2017-04-11 14:42:08 +00:00
Diana Picus 1314a2889c GlobalISel: Allow legalizing G_FADD to a libcall
Use the same handling in the generic legalizer code as for the other
libcalls (G_FREM, G_FPOW).

Enable it on ARM for float and double so we can test it.

llvm-svn: 299931
2017-04-11 10:52:34 +00:00
Matthew Simpson 1468d3e04e [ARM/AArch64] Ensure valid vector element types for interleaved accesses
This patch refactors and strengthens the type checks performed for interleaved
accesses. The primary functional change is to ensure that the interleaved
accesses have valid element types. The added test cases previously failed
because the element type is f128.

Differential Revision: https://reviews.llvm.org/D31817

llvm-svn: 299864
2017-04-10 18:34:37 +00:00
Diana Picus 3ff82c8cb7 [ARM] GlobalISel: Support G_FPOW for float and double
Legalize to a libcall.

llvm-svn: 299841
2017-04-10 09:27:39 +00:00
Eli Friedman 75631c97ba [ARM] Prefer BIC over BFC in ARM mode.
BIC is generally faster, and it can put the output in a different
register from the input.

We already do this in Thumb2 mode; not sure why the equivalent fix
never got applied to ARM mode.

Differential Revision: https://reviews.llvm.org/D31797

llvm-svn: 299803
2017-04-07 22:01:23 +00:00
Diana Picus 3c608448e1 [ARM] GlobalISel: Support frem for 64-bit values
Legalize to a libcall.

llvm-svn: 299756
2017-04-07 10:50:02 +00:00
Diana Picus a5bab61a8d [ARM] GlobalISel: Support frem for 32-bit values
Legalize to a libcall.
On this occasion, also start allowing soft float subtargets. For the
moment G_FREM is the only legal floating point operation for them.

llvm-svn: 299753
2017-04-07 09:41:39 +00:00
Yi Kong 60b5a1cd17 Revert "Revert "[ARM] Add Kryo to available targets""
This reverts commit dc9458d5a747a02a9a8f198b84c2b92a6939a8dd.

Added missing case for PreISelOperandLatencyAdjustment.

llvm-svn: 299724
2017-04-06 22:47:47 +00:00
Huihui Zhang 98240e9643 [SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL
In LowerMUL, the chain information is not preserved for the new
created Load SDNode.

For example, if a Store alias with one of the operand of Mul.
The Load for that operand need to be scheduled before the Store.
The dependence is recorded in the chain of Store, in TokenFactor.
However, when lowering MUL, the SDNodes for the new Loads for
VMULL are not updated in the TokenFactor for the Store. Thus the
chain is not preserved for the lowered VMULL.

llvm-svn: 299701
2017-04-06 20:22:51 +00:00
Yi Kong 5e7059b702 Revert "[ARM] Add Kryo to available targets"
This reverts commit 942d6e6f58bf7e63810dd7cbcbce1fdfa5ebc6d4.

Build breakage.

llvm-svn: 299689
2017-04-06 19:16:14 +00:00
Yi Kong 2b622b1fc1 [ARM] Add Kryo to available targets
Summary:
Host CPU detection now supports Kryo, so we need to recognize it in ARM
target.

Reviewers: mcrosier, t.p.northover, rengolin, echristo, srhines

Reviewed By: t.p.northover, echristo

Subscribers: aemerson

Differential Revision: https://reviews.llvm.org/D31775

llvm-svn: 299674
2017-04-06 18:10:08 +00:00
David Green 1b4b59a415 [ARM] Remove a dead ADD during the creation of TBBs
During the optimisation of jump tables in the constant island pass,
an extra ADD could be left over, now dead but not removed.

Differential Revision: https://reviews.llvm.org/D31389

llvm-svn: 299634
2017-04-06 08:32:47 +00:00
Matthias Braun 44047427b1 ARMFrameLowering: Slight cleanups; NFC
llvm-svn: 299562
2017-04-05 16:58:41 +00:00