Commit Graph

30331 Commits

Author SHA1 Message Date
Ed Maste 6d0bee5fc2 DebugInfo: .debug_line DWARF64 support
This adds support for the 64-bit DWARF format, but is still limited to
less than 4GB of debug data by the DataExtractor class.  Some versions
of the GNU MIPS toolchain generate 64-Bit DWARF even though it isn't
actually necessary.

Differential Revision: http://reviews.llvm.org/D1988

llvm-svn: 238434
2015-05-28 15:38:17 +00:00
Rafael Espindola e2b355d651 Don't create an unused _GLOBAL_OFFSET_TABLE_.
This was a bug for bug compatibility with gas that is completely unnecessary.
If a _GLOBAL_OFFSET_TABLE_ symbol is used, it will already be created by
the time we get to the ELF writer.

llvm-svn: 238432
2015-05-28 15:20:00 +00:00
Daniel Sanders 3985530328 [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.

Reviewers: petarj

Reviewed By: petarj

Subscribers: srhines, joerg, tberghammer, llvm-commits

Differential Revision: http://reviews.llvm.org/D9669

llvm-svn: 238427
2015-05-28 14:52:15 +00:00
Petar Jovanovic 9720283e99 [Mips64] Add support for MCJIT for MIPS64r2 and MIPS64r6
Add support for resolving MIPS64r2 and MIPS64r6 relocations in MCJIT.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D9667

llvm-svn: 238424
2015-05-28 13:48:41 +00:00
Yury Gribov 98b18599a6 [ASan] New approach to dynamic allocas unpoisoning. Patch by Max Ostapenko!
Differential Revision: http://reviews.llvm.org/D7098

llvm-svn: 238402
2015-05-28 07:51:49 +00:00
David Majnemer 587336d2ad [Reassociate] Canonicalizing 'x [+-] (-Constant * y)' isn't always a win
Canonicalizing 'x [+-] (-Constant * y)' is not a win if we don't *know*
we will open up CSE opportunities.

If the multiply was 'nsw', then negating 'y' requires us to clear the
'nsw' flag.  If this is actually worth pursuing, it is probably more
appropriate to do so in GVN or EarlyCSE.

This fixes PR23675.

llvm-svn: 238397
2015-05-28 06:16:39 +00:00
Jingyue Wu c2a014697a [NaryReassociate] Run EarlyCSE after NaryReassociate
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline

1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.

2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.

Test Plan: updated @reassociate_gep_nsw in nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: dberlin, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9947

llvm-svn: 238396
2015-05-28 04:56:52 +00:00
Chandler Carruth 88862eaefa [x86] Refactor the tests for popcnt.
Extracted from the D6531 patch by Bruno Cardoso Lopes, and re-generated
to reflect the current state of the world. This should let Bruno's D6531
actually show the delta between the approaches by running the x86 test
case update script after re-building.

llvm-svn: 238391
2015-05-28 02:40:15 +00:00
Lang Hames 8b34f82462 [RuntimeDyld] Fix MachO i386 SECTDIFF relocation to support non-zero addends.
Previously, relocations of the form 'A - B + C' would fail on i386 when C was
non-zero.

llvm-svn: 238356
2015-05-27 20:50:01 +00:00
Diego Novillo df4837ba6b Final fix for PR 23499 and IR test case.
This fixes a bit I forgot in r238335. In addition to the data record and
the counter, we can also move the name of the counter to the comdat for
the associated function.

I'm also adding an IR test case to check that these three elements are
placed in the proper comdat.

llvm-svn: 238351
2015-05-27 19:34:01 +00:00
Renato Golin f7c0d5f247 ARMTargetParser: Normalising build attributes
Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu
strings are using ARMTargetParser, it's time to make it a bit more conforming
with what the ABI says.

This commit adds some clarification on what build attributes are accepted and
which are "non-standard". It also makes clear that the "defaultCPU" and
"defaultArch" methods were really just build attribute getters.

It also diverges from GCC's behaviour to say that armv2/armv3 are really an
ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4.

llvm-svn: 238344
2015-05-27 18:15:37 +00:00
Alex Lorenz 2bdb4e1063 Resubmit r237954 (MIR Serialization: print and parse LLVM IR using MIR format).
This commit a 3rd attempt at comitting the initial MIR serialization patch.
The first commit (r237708) was reverted in 237730. Then the second commit
(r237954) was reverted in r238007, as the MIR library under CodeGen caused
a circular dependency where the CodeGen library depended on MIR and MIR
library depended on CodeGen.

This commit has fixed the dependencies between CodeGen and MIR by
reorganizing the MIR serialization code - the code that prints out
MIR has been moved to CodeGen, and the MIR library has been renamed
to MIRParser. Now the CodeGen library doesn't depend on the
MIRParser library, thus the circular dependency no longer exists.

--Original Commit Message--

MIR Serialization: print and parse LLVM IR using MIR format.

This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR
using the MIR format. This pass is then added as a last pass when a
'stop-after' option is used in llc. The new library adds the initial
functionality for parsing of MIR files as well. This commit also
extends the llc tool so that it can recognize and parse MIR input files.

Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames

Differential Revision: http://reviews.llvm.org/D9616 

llvm-svn: 238341
2015-05-27 18:02:19 +00:00
Zoran Jovanovic 85a53a1ed5 [mips][microMIPSr6] Implement SEB and SEH instructions
Differential Revision: http://reviews.llvm.org/D9739

llvm-svn: 238333
2015-05-27 15:39:47 +00:00
Jozef Kolek 888830adfe [mips][microMIPSr6] Implement BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC and BNEZALC instructions
This patch implements microMIPS32r6 BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC
and BNEZALC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D10031

llvm-svn: 238325
2015-05-27 14:19:22 +00:00
Elena Demikhovsky 86c7b46680 AVX-512: Fixed a bug in extracting subvector from v64i1
By Igor Breger (igor.breger@intel.com)

llvm-svn: 238322
2015-05-27 14:09:33 +00:00
Daniel Sanders 8ef465f4bb Revert r238190 and r238197: [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
This broke the llvm-mips-linux builder and several of our out-of-tree builders.
Initial investigations show that the commit probably isn't the problem but
reverting anyway while I investigate.

llvm-svn: 238302
2015-05-27 08:44:01 +00:00
Elena Demikhovsky 3948c590e3 AVX-512: Implemented all forms of sign-extend and zero-extend instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238301
2015-05-27 08:15:19 +00:00
Quentin Colombet aa8020752e [X86] Implement the support for shrink-wrapping.
With this patch the x86 backend is now shrink-wrapping capable
and this functionality can be tested by using the
-enable-shrink-wrap switch.

The next step is to make more test and enable shrink-wrapping by
default for x86.

Related to <rdar://problem/20821487>

llvm-svn: 238293
2015-05-27 06:28:41 +00:00
Chandler Carruth a004f22a2d [inliner] Fix the early-exit of the inline cost analysis to correctly
model the dense vector instruction bonuses.

Previously, this code really didn't effectively compute the density of
inlined vector instructions and apply the intended inliner bonus. It
would try to compute it repeatedly while analyzing the function and
didn't handle the case where future vector instructions would tip the
scales back towards the bonus.

Instead, speculatively apply all possible bonuses to the threshold
initially. Once we *know* that a certain bonus can not be applied,
subtract it. This should delay early bailout enough to get much more
consistent results without actually causing us to analyze huge swaths of
code. I expect some (hopefully mild) compile time hit here, and some
swings in performance, but this was definitely the intended behavior of
these bonuses.

This also dramatically simplifies the computation of the bonuses to not
interact with each other in confusing ways. The previous code didn't do
a good job of this and the values for bonuses may be surprising but are
at least now clearly written in the code.

Finally, fix code to be in line with comments and use zero as the
bailout condition.

Patch by Easwaran Raman, with some comment tweaks by me to try and
further clarify what is going on with this code.

http://reviews.llvm.org/D8267

llvm-svn: 238276
2015-05-27 02:49:05 +00:00
Filipe Cabecinhas 6a92a3fe34 [BitcodeReader] Change assert to report_fatal_error
It can be triggered by user input.

Bug found with AFL fuzz.

llvm-svn: 238272
2015-05-27 01:05:40 +00:00
Filipe Cabecinhas 8cd99e9a5a [BitstreamReader] Make sure the Array operand type is an encoding
Bug found with AFL fuzz.

llvm-svn: 238269
2015-05-27 00:48:43 +00:00
Filipe Cabecinhas bc6a909384 [BitcodeReader] Make sure abbrev records have at least one operand (record code)
Bug found with AFL fuzz.

llvm-svn: 238265
2015-05-26 23:52:21 +00:00
Owen Anderson 85fa7d5037 Add initial support for the convergent attribute.
llvm-svn: 238264
2015-05-26 23:48:40 +00:00
Filipe Cabecinhas 0eb8a59a67 [BitcodeReader] Sanity check on Comdat ID
Shouldn't be an assert, since user input can trigger it.

Bug found with AFL fuzz.

llvm-svn: 238261
2015-05-26 23:00:56 +00:00
Rafael Espindola 2fb8401b2a Print "lock \t foo" instead of "lock \n foo".
This gets gas and llc -filetype=obj to agree on the order of prefixes.

For llvm-mc we need to fix the asm parser to know that it makes a difference
on which line the "lock" is in.

Part of pr23594.

llvm-svn: 238232
2015-05-26 18:35:10 +00:00
Jan Vesely b670d37105 R600: Use SIGN_EXTEND_INREG for SEXT loads
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 238229
2015-05-26 18:07:22 +00:00
Diego Novillo bfecc06656 Revert "Re-commit changes in r237579 with fix for bug breaking windows builds."
This reverts commit r238201 to fix linking problems in x86 Linux
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html

llvm-svn: 238223
2015-05-26 17:45:38 +00:00
Matt Arsenault 48b3b238cc Forgot to add lit.local.cfg for new R600 directory
llvm-svn: 238218
2015-05-26 17:01:16 +00:00
Matt Arsenault f05b02351f CodeGenPrepare: Don't match addressing modes through addrspacecast
This was resulting in the addrspacecast being removed and incorrectly
replaced with a ptrtoint when sinking.

llvm-svn: 238217
2015-05-26 16:59:43 +00:00
Tom Stellard 245c15fce2 R600/SI: Add assembler support for all CI and VI VOP2 instructions
llvm-svn: 238211
2015-05-26 15:55:52 +00:00
Luke Cheeseman a5d053d6f4 Re-commit changes in r237579 with fix for bug breaking windows builds.
llvm-svn: 238201
2015-05-26 13:40:31 +00:00
Elena Demikhovsky b2b901c607 AVX-512: fixed a bug in lowering VSELECT for 512-bit vector
https://llvm.org/bugs/show_bug.cgi?id=23634

llvm-svn: 238195
2015-05-26 11:32:39 +00:00
Michael Kuperstein db0712f986 Use std::bitset for SubtargetFeatures.
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures.
Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. 
This should now be fixed.

llvm-svn: 238192
2015-05-26 10:47:10 +00:00
Daniel Sanders 58ee4c9451 [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.

This commit uses DW_EH_PE_sdata8 for N64 as far as is possible at the moment.
However, it is possible to end up with DW_EH_PE_sdata4 when a TargetMachine is
not available. There's no risk of issues with inconsistency here since the
tables are self describing but it does mean there is a small chance of the
PC-relative offset being out of range for particularly large programs.

Reviewers: petarj

Reviewed By: petarj

Subscribers: srhines, joerg, tberghammer, llvm-commits

Differential Revision: http://reviews.llvm.org/D9669

llvm-svn: 238190
2015-05-26 10:19:18 +00:00
Bjorn Steinbrink 236446cd4c Remove conflicting attributes before adding deduced readonly/readnone
Summary:
In case of functions that have a pointer argument and only pass it to
each other, the function attributes pass deduces that the pointer should
get the readnone attribute, but fails to remove a readonly attribute
that may already have been present.

Reviewers: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9995

llvm-svn: 238152
2015-05-25 19:46:38 +00:00
Davide Italiano f071bd0a18 [llvm-readobj/ELF] Teach how to decode DF_1_XXX flags
llvm-readobj -dynamic-table output.
Before:
0x000000006FFFFFFB unknown

After:
0x000000006FFFFFFB FLAGS_1 NOW ORIGIN

Differential Revision:	http://reviews.llvm.org/D9958

llvm-svn: 238151
2015-05-25 19:12:18 +00:00
Simon Pilgrim 0be4fa761f [X86][AVX2] Vectorized i16 shift operators
Part of D9474, this patch extends AVX2 v16i16 types to 2 x 8i32 vectors and uses i32 shift variable shifts before packing back to i16.

Adds AVX2 tests for v8i16 and v16i16 

llvm-svn: 238149
2015-05-25 17:49:13 +00:00
Tom Stellard ec87f841c6 R600/SI: Fix bug with v_interp_p1_f32 instructions on 16 bank lds chips
The src and dst register cannot be the same on chips with 16 lds banks.

llvm-svn: 238147
2015-05-25 16:15:54 +00:00
Kit Barton 6646033e6e This patch adds support for the vector quadword add/sub instructions introduced
in POWER8:

vadduqm
vaddeuqm
vaddcuq
vaddecuq
vsubuqm
vsubeuqm
vsubcuq
vsubecuq
In addition to adding the instructions themselves, it also adds support for the
v1i128 type for intrinsics (Intrinsics.td, Function.cpp, and
IntrinsicEmitter.cpp).

http://reviews.llvm.org/D9081

llvm-svn: 238144
2015-05-25 15:49:26 +00:00
Michael Kuperstein f145228676 [X86] When pattern-matching scalar FMA3 intrinsics, don't re-arrange the first and second operands.
The semantics of the scalar FMA intrinsics are that the high vector elements are copied from the first source.
The existing pattern switches src1 and src2 around, to match the "213" order, which ends up tying the original src2 to the dest. Since the actual scalar fma3 instructions copy the high elements from the dest register, the wrong values are copied.

This modifies the pattern to leave src1 and src2 in their original order.

Differential Revision: http://reviews.llvm.org/D9908

llvm-svn: 238131
2015-05-25 12:35:25 +00:00
Elena Demikhovsky 1c1391ba24 Added promotion to EXTRACT_SUBVECTOR operand.
I encountered with this case in one of KNL tests for i1 vectors.
v16i1 = EXTRACT_SUBVECTOR v32i1, x

llvm-svn: 238130
2015-05-25 11:33:13 +00:00
Duncan P. N. Exon Smith 1a65e4ade4 AsmPrinter: Emit the DwarfStringPool offset directly when possible
Change `DwarfStringPool` to calculate byte offsets on-the-fly, and
update `DwarfUnit::getLocalString()` to use a `DIEInteger` instead of a
`DIEDelta` when Dwarf doesn't use relocations (i.e., Mach-O).  This
eliminates another call to `EmitLabelDifference()`, and drops memory
usage from 865 MB down to 861 MB, around 0.5%.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)

llvm-svn: 238114
2015-05-24 16:14:59 +00:00
Duncan P. N. Exon Smith 004e73420d DebugInfo: Clarify test/DebugInfo/X86/stmt-list-multiple-compile-units.ll
This test was relying on the numbering of preceding .set directives, but
an upcoming commit is going to remove some of them.  Make the CHECKs
more nuanced.

llvm-svn: 238113
2015-05-24 16:10:35 +00:00
Matt Arsenault 65ad1602b0 Add target hook to allow merging stores of nonzero constants
On GPU targets, materializing constants is cheap and stores are
expensive, so only doing this for zero vectors was silly.

Most of the new testcases aren't optimally merged, and are for
later improvements.

llvm-svn: 238108
2015-05-24 00:51:27 +00:00
Benjamin Kramer be48c40475 [AArch64] Clean up the ELF streamer a bit.
llvm-svn: 238102
2015-05-23 16:39:10 +00:00
Hal Finkel 5f2a1379ef [PowerPC] Fix fast-isel when compare is split from branch
When the compare feeding a branch was in a different BB from the branch, we'd
try to "regenerate" the compare in the block with the branch, possibly trying
to make use of values not available there. Copy a page from AArch64's play book
here to fix the problem (at least in terms of correctness).

Fixes PR23640.

llvm-svn: 238097
2015-05-23 12:18:10 +00:00
Akira Hatanaka ddf76aa36f Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions.
This is part of the work to remove TargetMachine::resetTargetOptions.

In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.

There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim". 

llvm-svn: 238080
2015-05-23 01:14:08 +00:00
Akira Hatanaka aa115fbb7b Remove unnecessary command line option "-disable-fp-elim".
This option currently has no effect as function attribute
"no-frame-pointer-elim=false" overrides it.

llvm-svn: 238077
2015-05-23 00:31:56 +00:00
Rafael Espindola 445712264d Revert "make reciprocal estimate code generation more flexible by adding command-line options"
This reverts commit r238051.

It broke some bots:

http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190

llvm-svn: 238075
2015-05-23 00:22:44 +00:00
Philip Reames 6bbe9743d1 Correct a mistaken comment from 238071 [NFC]
llvm-svn: 238074
2015-05-23 00:05:43 +00:00
Rafael Espindola 5960cee1f5 Produce a single string table in a ELF .o
Normally an ELF .o has two string tables, one for symbols, one for section
names.

With the scheme of naming sections like ".text.foo" where foo is a symbol,
there is a big potential saving in using a single one.

Building llvm+clang+lld with master and with this patch the results were:

master:                          193,267,008 bytes
patch:                           186,107,952 bytes
master non unique section names: 183,260,192 bytes
patch non unique section names:  183,118,632 bytes

So using non usique saves 10,006,816 bytes, and the patch saves 7,159,056 while
still using distinct names for the sections.

llvm-svn: 238073
2015-05-22 23:58:30 +00:00
Philip Reames 7c78ef7dd9 Extend EarlyCSE to handle basic cases from JumpThreading and CVP
This patch extends EarlyCSE to take advantage of the information that a controlling branch gives us about the value of a Value within this and dominated basic blocks. If the current block has a single predecessor with a controlling branch, we can infer what the branch condition must have been to execute this block. The actual change to support this is downright simple because EarlyCSE's existing scoped hash table logic deals with most of the complexity around merging.

The patch actually implements two optimizations.
1) The first is analogous to JumpThreading in that it enables EarlyCSE's CSE handling to fold branches which are exactly redundant due to a previous branch to branches on constants. (It doesn't actually replace the branch or change the CFG.) This is pretty clearly a win since it enables substantial CFG simplification before we start trying to inline.
2) The second is analogous to CVP in that it exploits the knowledge gained to replace dominated *uses* of the original value. EarlyCSE does not otherwise reason about specific uses, so this is the more arguable one. It does enable further simplication and constant folding within the rest of the visit by EarlyCSE.

In both cases, the added code only handles the easy dominance based case of each optimization. The general case is deferred to the existing passes.

Differential Revision: http://reviews.llvm.org/D9763

llvm-svn: 238071
2015-05-22 23:53:24 +00:00
David Majnemer 4c3753c4d4 [InstCombine] Don't eagerly propagate nsw for A*B+A*C => A*(B+C)
InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C).
This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then
nothing in the LHS overflows, but the multiplication in RHS overflows.

We need to first make sure that we won't multiple by INT_SMAX + 1.

Test case `add_of_mul` contributed by Sanjoy Das.

This fixes PR23635.

Differential Revision: http://reviews.llvm.org/D9629

llvm-svn: 238066
2015-05-22 23:02:11 +00:00
Ahmed Bougacha 236f9040d0 [AArch64][CGP] Sink zext feeding stxr/stlxr into the same block.
The usual CodeGenPrepare trickery, on a target-specific intrinsic.
Without this, the expansion of atomics will usually have the zext
be hoisted out of the loop, defeating the various patterns we have
to catch this precise case.

Differential Revision: http://reviews.llvm.org/D9930

llvm-svn: 238054
2015-05-22 21:37:17 +00:00
Rafael Espindola 95ee81daf6 Relax these tests a bit.
It is not relevant where in the string table the name is located.

llvm-svn: 238053
2015-05-22 21:37:13 +00:00
Ahmed Bougacha 3d2d9d1d91 [AArch64] Robustize atomic cmpxchg test a little more. NFC.
We changed the test to test non-constant values in r238049.
We can also use CHECK-NEXT to be a little stricter.

llvm-svn: 238052
2015-05-22 21:35:14 +00:00
Sanjay Patel ba2ba80302 make reciprocal estimate code generation more flexible by adding command-line options
This patch adds a class for processing many recip codegen possibilities.
The TargetRecip class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 238051
2015-05-22 21:10:06 +00:00
Ahmed Bougacha df94265963 [AArch64] Robustize atomic cmpxchg test. NFC.
Constants are easy to get right the wrong way.

llvm-svn: 238049
2015-05-22 21:08:15 +00:00
Quentin Colombet 494eb606cd Reapply r238011 with a fix for the trap instruction.
The problem was that I slipped a change required for shrink-wrapping, namely I
used getFirstTerminator instead of the getLastNonDebugInstr that was here before
the refactoring, whereas the surrounding code is not yet patched for that.

Original message:
[X86] Refactor the prologue emission to prepare for shrink-wrapping.

- Add a late pass to expand pseudo instructions (tail call and EH returns).
 Instead of doing it in the prologue emission.
- Factor some static methods in X86FrameLowering to ease code sharing.

NFC.

Related to <rdar://problem/20821487>

llvm-svn: 238035
2015-05-22 18:10:47 +00:00
Bill Schmidt e26236eed9 [PPC64] Add support for clrbhrb, mfbhrbe, rfebb.
This patch adds support for the ISA 2.07 additions involving the
branch history rolling buffer and event-based branching.  These will
not be used by typical applications, so built-in support is not
required.  They will only be available via inline assembly.

Assembly/disassembly tests are included in the patch.

llvm-svn: 238032
2015-05-22 16:44:10 +00:00
Rafael Espindola 62a07cb59b Stop inventing symbol sizes.
MachO and COFF quite reasonably only define the size for common symbols.

We used to try to figure out the "size" by computing the gap from one symbol to
the next.

This would not be correct in general, since a part of a section can belong to no
visible symbol (padding, private globals).

It was also really expensive, since we would walk every symbol to find the size
of one.

If a caller really wants this, it can sort all the symbols once and get all the
gaps ("size") in O(n log n) instead of O(n^2).

On MachO this also has the advantage of centralizing all the checks for an
invalid n_sect.

llvm-svn: 238028
2015-05-22 15:43:00 +00:00
Rafael Espindola 0d85d10747 Detect invalid section indexes when we first read them.
We still detect the same errors, but now we do it earlier.

llvm-svn: 238024
2015-05-22 14:59:27 +00:00
John Brawn c815a969c7 [ARM] Fix typo in subtarget feature list for 7em triple
The list of subtarget features for the 7em triple contains 't2xtpk',
which actually disables that subtarget feature. Correct that to
'+t2xtpk' and test that the instructions enabled by that feature do
actually work.

Differential Revision: http://reviews.llvm.org/D9936

llvm-svn: 238022
2015-05-22 14:16:22 +00:00
Rafael Espindola f7cfed4bff Fix llvm-nm -S option.
It is explicitly documented to have no effect on object formats where symbols
don't have sizes.

llvm-svn: 238019
2015-05-22 13:28:35 +00:00
Rafael Espindola 4fb845f031 Make this test stricter. NFC.
llvm-svn: 238018
2015-05-22 13:17:31 +00:00
NAKAMURA Takumi 263b27997d Revert r237954, "Resubmit r237708 (MIR Serialization: print and parse LLVM IR using MIR format)."
It brought cyclic dependencies between LLVMCodeGen and LLVMMIR.

llvm-svn: 238007
2015-05-22 07:17:07 +00:00
David Majnemer 1503258157 [InstSimplify] Handle some overflow intrinsics in InstSimplify
This change does a few things:
- Move some InstCombine transforms to InstSimplify
- Run SimplifyCall from within InstCombine::visitCallInst
- Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0.

llvm-svn: 237995
2015-05-22 03:56:46 +00:00
Philip Reames b47b9c2b2b [LICM] Sinking doesn't involve the preheader
PR23608 pointed out that using the preheader to gain a context instruction isn't always legal because a loop might not have a preheader.  When looking into that, I realized that using the preheader to determine legality for sinking is questionable at best.  Given no test covers that case and the original commit didn't seem to intend it, I restructured the code to only ask context sensative queries for hoising of loads and stores.  This is effectively a partial revert of 237593.

llvm-svn: 237985
2015-05-22 02:14:05 +00:00
Hans Wennborg 2ecb8dc39c Revert r236894 "[BasicAA] Fix zext & sext handling"
This seems to have caused PR23626: Clang miscompiles webkit's base64 decoder

llvm-svn: 237984
2015-05-22 01:27:37 +00:00
Peter Collingbourne 7e814d100b Revert r237590, "ARM: allow jump tables to be placed as constant islands."
Caused a miscompile of the Android port of Chromium, details
forthcoming.

llvm-svn: 237972
2015-05-21 23:20:55 +00:00
Jingyue Wu 4fc97f6df8 [NaryReassoc] reassociate GEP for CSE
Summary:
x = &a[i];
y = &a[i + j];

=>

y = x + j;

along with some refactoring work such as extracting method
findClosestMatchingDominator.

Depends on D9786 which provides the ScalarEvolution::getGEPExpr interface.

Test Plan: nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9802

llvm-svn: 237971
2015-05-21 23:17:30 +00:00
David Majnemer 27e89ba24c [InstCombine] X - 0 is equal to X, not undef
A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform
into undef instead of %X.

This fixes PR23624.

llvm-svn: 237968
2015-05-21 23:04:21 +00:00
Chad Rosier ce8e5abbaf [AArch64] Enhance the load/store optimizer with target-specific alias analysis.
Phabricator: http://reviews.llvm.org/D9863
llvm-svn: 237963
2015-05-21 21:36:46 +00:00
Keno Fischer c780e8ebcc Make it easier to use DwarfContext with MCJIT
Summary:
This supersedes http://reviews.llvm.org/D4010, hopefully properly
dealing with the JIT case and also adds an actual test case.
DwarfContext was basically already usable for the JIT (and back when
we were overwriting ELF files it actually worked out of the box by
accident), but in order to resolve relocations correctly it needs
to know the load address of the section.
Rather than trying to get this out of the ObjectFile or requiring
the user to create a new ObjectFile just to get some debug info,
this adds the capability to pass in that info directly.
As part of this I separated out part of the LoadedObjectInfo struct
from RuntimeDyld, since it is now required at a higher layer.

Reviewers: lhames, echristo

Reviewed By: echristo

Subscribers: vtjnash, friss, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D6961

llvm-svn: 237961
2015-05-21 21:24:32 +00:00
Alex Lorenz c37baf82a9 Resubmit r237708 (MIR Serialization: print and parse LLVM IR using MIR format).
This commit is a 2nd attempt at committing the initial MIR serialization patch.
The first commit (r237708) made the incremental buildbots unstable and was 
reverted in r237730. The original commit didn't add a terminating null 
character to the LLVM IR source which was passed to LLParser, and this 
sometimes caused the test 'llvmIR.mir' to fail with a parsing error because 
the LLVM IR source didn't have a null character immediately after the end 
and thus LLLexer encountered some garbage characters that ultimately caused 
the error.

This commit also includes the other test fixes I committed in
r237712 (llc path fix) and r237723 (remove target triple) which
also got reverted in r237730.

--Original Commit Message--

MIR Serialization: print and parse LLVM IR using MIR format.

This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR 
using the MIR format. This pass is then added as a last pass when a 
'stop-after' option is used in llc. The new library adds the initial 
functionality for parsing of MIR files as well. This commit also 
extends the llc tool so that it can recognize and parse MIR input files.

Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames

Differential Revision: http://reviews.llvm.org/D9616

llvm-svn: 237954
2015-05-21 20:54:45 +00:00
Bill Schmidt e13ac91c5d [PPC64] Handle vpkudum mask pattern correctly when vpkudum isn't available
My recent patch to add support for ISA 2.07 vector pack/unpack
instructions didn't properly check for availability of the vpkudum
instruction when recognizing it as a special vector shuffle case.
This causes us to leave the vector shuffle in place (rather than
converting it to a vector permute) so that it can be recognized later
as a vpkudum, but that pattern is invalid for processors prior to
POWER8.  Thus LLVM crashes with an "unable to select" message.  We
observed this since one of our buildbots is configured to generate
code for a POWER7.

This patch fixes the problem by checking for availability of the
vpkudum instruction during custom lowering of vector shuffles.

I've added a test case variant for the vpkudum pattern when the
instruction isn't available.

llvm-svn: 237952
2015-05-21 20:48:49 +00:00
Adrian Prantl 1f599f9f65 IR / debug info: Add a DWOId field to DICompileUnit,
so DWARF skeleton CUs can be expression in IR. A skeleton CU is a
(typically empty) DW_TAG_compile_unit that has a DW_AT_(GNU)_dwo_name and
a DW_AT_(GNU)_dwo_id attribute. It is used to refer to external debug info.

This is a prerequisite for clang module debugging as discussed in
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html.
In order to refer to external types stored in split DWARF (dwo) objects,
such as clang modules, we need to emit skeleton CUs, which identify the
dwarf object (i.e., the clang module) by filename (the SplitDebugFilename)
and a hash value, the dwo_id.

This patch only contains the IR changes. The idea is that a CUs with a
non-zero dwo_id field will be emitted together with a DW_AT_GNU_dwo_name
and DW_AT_GNU_dwo_id attribute.

http://reviews.llvm.org/D9488
rdar://problem/20091852

llvm-svn: 237949
2015-05-21 20:37:30 +00:00
Hal Finkel 3b3c9c3e44 [PPC/LoopUnrollRuntime] Don't avoid high-cost trip count computation on the PPC/A2
On X86 (and similar OOO cores) unrolling is very limited, and even if the
runtime unrolling is otherwise profitable, the expense of a division to compute
the trip count could greatly outweigh the benefits. On the A2, we unroll a lot,
and the benefits of unrolling are more significant (seeing a 5x or 6x speedup
is not uncommon), so we're more able to tolerate the expense, on average, of a
division to compute the trip count.

llvm-svn: 237947
2015-05-21 20:30:23 +00:00
Nemanja Ivanovic f02def6cbc Add support for VSX scalar single-precision arithmetic in the PPC target
http://reviews.llvm.org/D9891
Following up on the VSX single precision loads and stores added earlier, this
adds support for elementary arithmetic operations on single precision values
in VSX registers. These instructions utilize the new VSSRC register class.
Instructions added:
xsaddsp
xsdivsp
xsmulsp
xsresp
xsrsqrtesp
xssqrtsp
xssubsp

llvm-svn: 237937
2015-05-21 19:32:49 +00:00
Elena Demikhovsky 4aed59fc89 AVX-512: Enabled SSE intrinsics on AVX-512.
Predicate UseAVX depricates pattern selection on AVX-512.
This predicate is necessary for DAG selection to select EVEX form.
But mapping SSE intrinsics to AVX-512 instructions is not ready yet.
So I replaced UseAVX with HasAVX for intrinsics patterns.

llvm-svn: 237903
2015-05-21 14:01:32 +00:00
Artur Pilipenko 31619a847e Fix memory-dereferenceable.ll test
One of the testcases introduced by D9365 had incorrect !dereferenceable metadata on load. It must fail but it doesn't due to incorrect order of CHECK/CHECK-NOT commands in test. Fixed both.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D9877

llvm-svn: 237897
2015-05-21 12:51:38 +00:00
Simon Pilgrim e054199354 [X86][SSE] Improve support for 128-bit vector sign extension
This patch improves support for sign extension of the lower lanes of vectors of integers by making use of the SSE41 pmovsx* sign extension instructions where possible, and optimizing the sign extension by shifts on pre-SSE41 targets (avoiding the use of i64 arithmetic shifts which require scalarization).

It converts SIGN_EXTEND nodes to SIGN_EXTEND_VECTOR_INREG where necessary, that more closely matches the pmovsx* instruction than the default approach of using SIGN_EXTEND_INREG which splits the operation (into an ANY_EXTEND lowered to a shuffle followed by shifts) making instruction matching difficult during lowering. Necessary support for SIGN_EXTEND_VECTOR_INREG has been added to the DAGCombiner.

Differential Revision: http://reviews.llvm.org/D9848

llvm-svn: 237885
2015-05-21 10:05:03 +00:00
Toma Tabacu 1e0a69f222 [mips] [IAS] Add 2 missing CHECK directives for fixups in mips-expansions.s.
llvm-svn: 237884
2015-05-21 10:04:39 +00:00
Hal Finkel d249736572 [TableGen] Resolve complex def names inside multiclasses
We had not been trying hard enough to resolve def names inside multiclasses
that had complex concatenations, etc. Now we'll try harder.

Patch by Amaury Sechet!

llvm-svn: 237877
2015-05-21 04:32:56 +00:00
Ahmed Bougacha 97876fa894 [MemCpyOpt] Do move the memset, but look at its dest's dependencies.
In effect a partial revert of r237858, which was a dumb shortcut.
Looking at the dependencies of the destination should be the proper
fix: if the new memset would depend on anything other than itself,
the transformation isn't correct.

llvm-svn: 237874
2015-05-21 01:43:39 +00:00
Ahmed Bougacha 5e0f425c27 [MemCpyOpt] Don't move the memset when optimizing memset+memcpy.
Fixes PR23599, another miscompile introduced by r235232: when there is
another dependency on the destination of the created memset (i.e., the
part of the original destination that the memcpy doesn't depend on)
between the memcpy and the original memset, we would insert the created
memset after the memcpy, and thus after the other dependency.

Instead, insert the created memset right after the old one.

llvm-svn: 237858
2015-05-20 23:55:16 +00:00
Andrew Kaylor a6c5b9682e [WinEH] C++ EH state numbering fixes
Differential Revision: http://reviews.llvm.org/D9787

llvm-svn: 237854
2015-05-20 23:22:24 +00:00
Reid Kleckner 2632f0df48 [WinEH] Store pointers to the LSDA in the exception registration object
We aren't yet emitting the LSDA yet, so this will still fail to
assemble.

llvm-svn: 237852
2015-05-20 23:08:04 +00:00
Hans Wennborg a8f8df5dd2 Revert r237828 "[X86] Remove unused node after morphing it from shr to and."
This caused assertions during DAG combine: PR23601.

llvm-svn: 237843
2015-05-20 22:31:55 +00:00
Davide Italiano 141b2891cb [Target/ARM] Only enable OptimizeBarrierPass at -O1 and above.
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.

Discussed with Tim Northover on IRC.

llvm-svn: 237837
2015-05-20 21:40:38 +00:00
Benjamin Kramer a74480d1eb [X86] Remove unused node after morphing it from shr to and.
In some cases it won't get cleaned up properly leading to crashes
downstream. PR23353.

Based on a patch by Davide Italiano.

llvm-svn: 237828
2015-05-20 20:10:26 +00:00
James Molloy 2b21a7cf36 Reapply r237539 with a fix for the Chromium build.
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.

> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majenemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.

llvm-svn: 237821
2015-05-20 18:41:25 +00:00
Pawel Bylica 8011da9628 Fix icmp lowering
Summary:
During icmp lowering it can happen that a constant value can be larger than expected (see the code around the change).
APInt::getMinSignedBits() must be checked again as the shift before can change the constant sign to positive.
I'm not sure it is the best fix possible though.

Test Plan: Regression test included.

Reviewers: resistor, chandlerc, spatel, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9147

llvm-svn: 237812
2015-05-20 17:21:09 +00:00
Alexey Samsonov 0ebbe74b73 Temporary delete the test while we're investigating crashes in LLVMObject it causes.
llvm-svn: 237809
2015-05-20 17:01:06 +00:00
Elena Demikhovsky f61727d880 AVX-512: fixed algorithm of building vectors of i1 elements
fixed extract-insert i1 element,
load i1, zextload i1 should be with "and $1, %reg" to prevent loading garbage.
added a bunch of new tests.

llvm-svn: 237793
2015-05-20 14:32:03 +00:00
Daniel Sanders 69c6008e49 Revert r237789 - [mips] The naming convention for private labels is ABI dependant.
It works, but I've noticed that I missed several callers of createMCAsmInfo()
and many don't have a TargetMachine to provide.

llvm-svn: 237792
2015-05-20 14:18:59 +00:00
Daniel Sanders 196390e8af [mips] Fix ehframe-indirect.ll test.
Summary:
-check-prefix replaces the default CHECK prefix rather than adding to it and
must be explicitly re-added.

Also added the N32 cases.

Reviewers: petarj

Reviewed By: petarj

Subscribers: tberghammer, llvm-commits

Differential Revision: http://reviews.llvm.org/D9668

llvm-svn: 237790
2015-05-20 13:19:19 +00:00
Daniel Sanders b718eca643 [mips] The naming convention for private labels is ABI dependant.
Summary:
For N32/N64, private labels begin with '.L' but for O32 they begin with '$'.

MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: jfb, sandeep, llvm-commits, rafael

Differential Revision: http://reviews.llvm.org/D9821

llvm-svn: 237789
2015-05-20 13:16:42 +00:00
Igor Laevsky 423bc9ec4c [StatepointLowering] Support of the gc.relocates for invoke statepoints.
This change implements support for lowering of the gc.relocates tied to the invoke statepoint.
This is acomplished by storing frame indices of the lowered values in "StatepointRelocatedValues" map inside FunctionLoweringInfo instead of storing them in per-basic block structure StatepointLowering.
After this change StatepointLowering is used only during "LowerStatepoint" call and it is not necessary to store it as a field in SelectionDAGBuilder anymore.

Differential Revision: http://reviews.llvm.org/D7798

llvm-svn: 237786
2015-05-20 11:37:25 +00:00
David Majnemer 402c5def11 [X86] Implement the local-exec TLS model for Windows targets
We know that _tls_index is zero for local-exec TLS variables because
they are always defined in the executable.

llvm-svn: 237772
2015-05-20 04:45:26 +00:00
Swaroop Sridhar 665bc9c936 Add a GCStrategy for CoreCLR
This change adds a new GC strategy for supporting the CoreCLR runtime.

This strategy is currently identical to Statepoint-example GC, 
but is necessary for several upcoming changes specific to CoreCLR, such as:

1. Base-pointers not explicitly reported for interior pointers
2. Different format for stack-map encoding
3. Location of Safe-point polls: polls are only needed before loop-back edges and before tail-calls (not needed at function-entry)
4. Runtime specific handshake between calls to managed/unmanaged functions.

llvm-svn: 237753
2015-05-20 01:07:23 +00:00
Philip Reames d97cdf28e6 [PlaceSafepoints] Stop special casing some intrinsics
We were special casing a handful of intrinsics as not needing a safepoint before them.  After running into another valid case - memset - I took a closer look and realized that almost no intrinsics need to have a safepoint poll before them.  Restructure the code to make that apparent so that we stop hitting these bugs.  The only intrinsics which need a safepoint poll before them are ones which can run arbitrary code.

llvm-svn: 237744
2015-05-19 23:40:11 +00:00
Hans Wennborg 2f21b8760e Revert r237539: "Reapply r237520 with another fix for infinite looping"
This caused PR23583.

llvm-svn: 237739
2015-05-19 23:06:30 +00:00
Alex Lorenz de1970fe66 Revert r237708 (MIR serialization) - incremental buildbots became unstable.
The incremental buildbots entered a pass-fail cycle where during the fail
cycle one of the tests from this commit fails for an unknown reason. I
have reverted this commit and will investigate the cause of this problem.

llvm-svn: 237730
2015-05-19 21:41:28 +00:00
Alex Lorenz edde50331d Fix MIR testcase committed in r237708 - remove target triple.
Remove the target specific triple and datalayout from the 
llvm IR in the MIR testcase file.

llvm-svn: 237723
2015-05-19 20:51:48 +00:00
Alexey Samsonov bf19a578e6 [DWARF parser] Add basic support for DWZ DWARF multifile extensions.
This change implements basic support for DWARF alternate sections
proposal: http://www.dwarfstd.org/ShowIssue.php?issue=120604.1&type=open

LLVM tools now understand new forms: DW_FORM_GNU_ref_alt and
DW_FORM_GNU_strp_alt, which are used as references to .debug_info and
.debug_str sections respectively, stored in a separate file, and
possibly shared between different executables / shared objects.

llvm-dwarfdump and llvm-symbolizer don't yet know how to access this
alternate debug file (usually pointed by .gnu_debugaltlink section),
but they can at lease properly parse and dump regular files, which
refer to it.

This change should fix crashes of llvm-dwarfdump and llvm-symbolizer on
files produced by running "dwz" tool. Such files are already installed
on some modern Linux distributions.

llvm-svn: 237721
2015-05-19 20:29:28 +00:00
Sanjoy Das f999547d11 Dereferenceable, dereferenceable_or_null metadata for loads
Summary:
Introduce dereferenceable, dereferenceable_or_null metadata for loads
with the same semantic as corresponding attributes.

This patch depends on http://reviews.llvm.org/D9253

Patch by Artur Pilipenko!

Reviewers: hfinkel, sanjoy, reames

Reviewed By: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9365

llvm-svn: 237720
2015-05-19 20:10:19 +00:00
Alex Lorenz 06e3bf670f Fix llc path in MIR testcases committed in r237708.
I've committed testcases with local llc path by mistake.

llvm-svn: 237712
2015-05-19 18:45:41 +00:00
Filipe Cabecinhas fc93be21ea Change a reachable unreachable to a fatal error.
Summary:
Also tagged a FIXME comment, and added information about why it breaks.

Bug found using AFL fuzz.

Reviewers: rafael, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9729

llvm-svn: 237709
2015-05-19 18:18:10 +00:00
Alex Lorenz c5e0d4d146 MIR Serialization: print and parse LLVM IR using MIR format.
This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR 
using the MIR format. This pass is then added as a last pass when a 
'stop-after' option is used in llc. The new library adds the initial 
functionality for parsing of MIR files as well. This commit also 
extends the llc tool so that it can recognize and parse MIR input files.

Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames

Differential Revision: http://reviews.llvm.org/D9616

llvm-svn: 237708
2015-05-19 18:17:39 +00:00
Igor Laevsky e03171863d [RewriteStatepointsForGC] For some values (like gep's and bitcasts) it's cheaper to clone them after statepoint than to emit proper relocates for them. This change implements this logic. There is alredy similar optimization in CodeGenPrepare, but doing so during RewriteStatepointsForGC allows to capture more opprtunities such as relocates in loops and longer instruction chains.
Differential Revision: http://reviews.llvm.org/D9774

llvm-svn: 237701
2015-05-19 15:59:05 +00:00
Zoran Jovanovic dde61c00c3 [mips][microMIPSr6] Implement NOR, OR, ORI, XOR and XORI instructions
Differential Revision: http://reviews.llvm.org/D8800

llvm-svn: 237697
2015-05-19 14:12:55 +00:00
Zoran Jovanovic 299fed6b7d [mips][microMIPSr6] Implement AND and ANDI instructions
Differential Revision: http://reviews.llvm.org/D8772

llvm-svn: 237696
2015-05-19 13:32:31 +00:00
Daniel Sanders c8cd58fa26 [mips] Correct and improve special-case shuffle instructions.
Summary:
The documentation writes vectors highest-index first whereas LLVM-IR writes
them lowest-index first. As a result, instructions defined in terms of
left_half() and right_half() had the halves reversed.

In addition to correcting them, they have been improved to allow shuffles
that use the same operand twice or in reverse order. For example, ilvev
used to accept masks of the form:
  <0, n, 2, n+2, 4, n+4, ...>
but now accepts:
  <0, 0, 2, 2, 4, 4, ...>
  <n, n, n+2, n+2, n+4, n+4, ...>
  <0, n, 2, n+2, 4, n+4, ...>
  <n, 0, n+2, 2, n+4, 4, ...>

One further improvement is that splati.[bhwd] is now the preferred instruction
for splat-like operations. The other special shuffles are no longer used
for splats. This lead to the discovery that <0, 0, ...> would not cause
splati.[hwd] to be selected and this has also been fixed.

This fixes the enc-3des test from the test-suite on Mips64r6 with MSA.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9660

llvm-svn: 237689
2015-05-19 12:24:52 +00:00
Zoran Jovanovic 3825261572 [mips][microMIPSr6] Implement DIV, DIVU, MOD and MODU instructions
Differential Revision: http://reviews.llvm.org/D8769

llvm-svn: 237685
2015-05-19 11:21:37 +00:00
Michael Kuperstein e88a021bf4 [X86] ABI change for x86-32: pass 3 vector arguments in-register instead of 4, except on Darwin.
This changes the ABI used on 32-bit x86 for passing vector arguments. 
Historically, clang passes the first 4 vector arguments in-register, and additional vector arguments on the stack, regardless of platform. That is different from the behavior of gcc, icc, and msvc, all of which pass only the first 3 arguments in-register.
The 3-register convention is documented, unofficially, in Agner's calling convention guide, and, officially, in the recently released version 1.0 of the i386 psABI.

Darwin is kept as is because the OS X ABI Function Call Guide explicitly documents the current (4-register) behavior.

This fixes PR21510

Differential revision: http://reviews.llvm.org/D9644

llvm-svn: 237682
2015-05-19 11:06:56 +00:00
Filipe Cabecinhas 32af542194 [BitcodeReader] Error out if we read an invalid function argument type
Bug found with AFL fuzz.

llvm-svn: 237650
2015-05-19 01:21:06 +00:00
Filipe Cabecinhas f3fa99c48e [BitcodeReader] It's a malformed block if CodeLenWidth is too big
Bug found with AFL fuzz.

llvm-svn: 237646
2015-05-19 00:34:17 +00:00
Reid Kleckner 7bc6b6210c Re-land r237175: [X86] Always return the sret parameter in eax/rax ...
This reverts commit r237210.

Also fix X86/complex-fca.ll to match the code that we used to generate
on win32 and now generate everwhere to conform to SysV.

llvm-svn: 237639
2015-05-18 23:35:09 +00:00
Jozef Kolek cc0c0fc926 [mips][microMIPSr6] Implement LSA instruction
This patch implements LSA instruction using mapping.

Differential Revision: http://reviews.llvm.org/D8919

llvm-svn: 237634
2015-05-18 23:12:10 +00:00
Filipe Cabecinhas 4708a02a78 [BitcodeReader] Make sure the type of the inserted value matches the type of the aggregate at those indices
Bug found with AFL-fuzz.

llvm-svn: 237628
2015-05-18 22:27:11 +00:00
Tim Northover d6223a2471 AArch64: work around ld64 bug more aggressively.
ld64 currently mishandles internal pointer relocations (i.e.
ARM64_RELOC_UNSIGNED referred to by section & offset rather than symbol). The
existing __cfstring clause was an early discovery and workaround for this, but
the problem is wider and we should avoid such relocations wherever possible for
now.

This code should be reverted to allowing internal relocations as soon as
possible.

PR23437.

llvm-svn: 237621
2015-05-18 22:07:20 +00:00
Filipe Cabecinhas 11bb8495f6 Extract the load/store type verification to a separate function.
Summary:
Added isLoadableOrStorableType to PointerType.

We were doing some checks in some places, occasionally assert()ing instead
of telling the caller. With this patch, I'm putting all type checking in
the same place for load/store type instructions, and verifying the same
thing every time.

I also added a check for load/store of a function type.

Applied extracted check to Load, Store, and Cmpxcg.

I don't have exhaustive tests for all of these, but all Error() calls in
TypeCheckLoadStoreInst are being tested (in invalid.test).

Reviewers: dblaikie, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9785

llvm-svn: 237619
2015-05-18 21:48:55 +00:00
Chen Li 6d8635a743 [Verifier] Assert gc_relocate always return a pointer type
Summary: Add an assertion in verifier.cpp to make sure gc_relocate relocate a gc pointer, and its return type has the same address space with the relocated pointer.

Reviewers: reames, AndyAyers, sanjoy, pgavlin

Reviewed By: pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9695

llvm-svn: 237605
2015-05-18 19:50:14 +00:00
Chen Li 74ca2a8777 [PlaceSafepoints] Assertion on that gc_result can not have preceding phis should only apply to invoke statepoint
Summary: When PlaceSafepoints pass replaces old return result with gc_result from statepoint, it asserts that gc_result can not have preceding phis in its parent block. This is only true on invoke statepoint, which terminates the block and puts its result at the beginning of the normal successor block. Call statepoint does not terminate the block and thus its result is in the same block with it. There should be no restriction on whether there are phis or not.

Reviewers: reames, igor-laevsky

Reviewed By: igor-laevsky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9803

llvm-svn: 237597
2015-05-18 19:02:25 +00:00
Sanjoy Das f8a0db50b2 Exploit dereferenceable_or_null attribute in LICM pass
Summary:
Allow hoisting of loads from values marked with dereferenceable_or_null
attribute. For values marked with the attribute perform
context-sensitive analysis to determine whether it's known-non-null or
not.

Patch by Artur Pilipenko!

Reviewers: hfinkel, sanjoy, reames

Reviewed By: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9253

llvm-svn: 237593
2015-05-18 18:07:00 +00:00
Tim Northover 12c41af07c ARM: allow jump tables to be placed as constant islands.
Previously, they were forced to immediately follow the actual branch
instruction. This was usually OK (the LEAs actually accessing them got emitted
nearby, and weren't usually separated much afterwards). Unfortunately, a
sufficiently nasty phi elimination dumps many instructions right before the
basic block terminator, and this can increase the range too much.

This patch frees them up to be placed as usual by the constant islands pass,
and consequently has to slightly modify the form of TBB/TBH tables to refer to
a PC-relative label at the final jump. The other jump table formats were
already position-independent.

rdar://20813304

llvm-svn: 237590
2015-05-18 17:10:40 +00:00
James Y Knight c49e78851c Sparc: support the "set" synthetic instruction.
This pseudo-instruction expands into 'sethi' and 'or' instructions,
or, just one of them, if the other isn't necessary for a given value.

Differential Revision: http://reviews.llvm.org/D9089

llvm-svn: 237585
2015-05-18 16:43:33 +00:00
Hal Finkel 44b81ee40b Preserve the order of READ_REGISTER and WRITE_REGISTER
At the present time, we don't have a way to represent general dependency
relationships, so everything is represented using memory dependency. In order
to preserve the data dependency of a READ_REGISTER on WRITE_REGISTER, we need
to model WRITE_REGISTER as writing (which we had been doing) and model
READ_REGISTER as reading (which we had not been doing). Fix this, and also the
way that the chain operands were generated at the SDAG level.

Patch by Nicholas Paul Johnson, thanks! Test case by me.

llvm-svn: 237584
2015-05-18 16:42:10 +00:00
Oliver Stannard 6cb23465e0 Revert r237579, as it broke windows buildbots
llvm-svn: 237583
2015-05-18 16:39:16 +00:00
James Y Knight f7e7017281 Sparc: Support PSR, TBR, WIM read/write instructions.
Differential Revision: http://reviews.llvm.org/D8971

llvm-svn: 237582
2015-05-18 16:38:47 +00:00
James Y Knight 24060be73a Sparc: Add the "alternate address space" load/store instructions.
- Adds support for the asm syntax, which has an immediate integer
  "ASI" (address space identifier) appearing after an address, before
  a comma.

- Adds the various-width load, store, and swap in alternate address
  space instructions. (ldsba, ldsha, lduba, lduha, lda, stba, stha,
  sta, swapa)

This does not attempt to hook these instructions up to pointer address
spaces in LLVM, although that would probably be a reasonable thing to
do in the future.

Differential Revision: http://reviews.llvm.org/D8904

llvm-svn: 237581
2015-05-18 16:35:04 +00:00
James Y Knight 807563df22 Add support for the Sparc implementation-defined "ASR" registers.
(Note that register "Y" is essentially just ASR0).

Also added some test cases for divide and multiply, which had none before.

Differential Revision: http://reviews.llvm.org/D8670

llvm-svn: 237580
2015-05-18 16:29:48 +00:00
Oliver Stannard 0c553afe6a [LLVM - ARM/AArch64] Add ACLE special register intrinsics
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.

This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.

Patch by Luke Cheeseman.

Differential Revision: http://reviews.llvm.org/D9699

llvm-svn: 237579
2015-05-18 16:23:33 +00:00
Adam Nemet df3dc5b9ca [LoopAccesses] If shouldRetryWithRuntimeCheck, reset InterestingDependences
When dependence analysis encounters a non-constant distance between
memory accesses it aborts the analysis and falls back to run-time checks
only.  In this case we weren't resetting the array of dependences.

llvm-svn: 237574
2015-05-18 15:37:03 +00:00
Adam Nemet c3384320f2 [LoopAccesses] Rearrange printed lines in -analyze
"Store to invariant address..." is moved as the last line.  This is not
the prime result of the analysis.  Plus it simplifies some of the tests.

llvm-svn: 237573
2015-05-18 15:36:57 +00:00
Jozef Kolek cbb227b48d [mips][microMIPSr6] Implement ALIGN and AUI instructions
This patch implements ALIGN and AUI instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8782

llvm-svn: 237563
2015-05-18 11:44:30 +00:00
Elena Demikhovsky b8573cba02 AVX-512: Added intrinsics for ADDSS/D, MULSS/D, SUBSS/D, DIVSS/D
instructions. These intrinsics are comming with rounding mode.
Added intrinsics for MAXSS/D, MINSS/D - with and without  sae.

By Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 237560
2015-05-18 07:24:19 +00:00
Elena Demikhovsky 08ce53c0ea AVX-512: Added patterns for scalar-to-vector broadcast
llvm-svn: 237558
2015-05-18 07:06:23 +00:00
Elena Demikhovsky ad9c396838 AVX-512: Added VBROADCASTF64X4, VBROADCASTF64X2, VBROADCASTI32X8, and other instructions from this set
Added encoding tests.

llvm-svn: 237557
2015-05-18 06:42:57 +00:00
Hal Finkel 8340de142c [PowerPC] Add extra r2 read deps on @toc@l relocations
If some commits are happy, and some commits are sad, this is a sad commit. It
is sad because it restricts instruction scheduling to work around a binutils
linker bug, and moreover, one that may never be fixed. On 2012-05-21, GCC was
updated not to produce code triggering this bug, and now we'll do the same...

When resolving an address using the ELF ABI TOC pointer, two relocations are
generally required: one for the high part and one for the low part. Only
the high part generally explicitly depends on r2 (the TOC pointer). And, so,
we might produce code like this:

.Ltmp526:
        addis 3, 2, .LC12@toc@ha
.Ltmp1628:
        std 2, 40(1)
        ld 5, 0(27)
        ld 2, 8(27)
        ld 11, 16(27)
        ld 3, .LC12@toc@l(3)
        rldicl 4, 4, 0, 32
        mtctr 5
        bctrl
        ld 2, 40(1)

And there is nothing wrong with this code, as such, but there is a linker bug
in binutils (https://sourceware.org/bugzilla/show_bug.cgi?id=18414) that will
misoptimize this code sequence to this:
        nop
        std     r2,40(r1)
        ld      r5,0(r27)
        ld      r2,8(r27)
        ld      r11,16(r27)
        ld      r3,-32472(r2)
        clrldi  r4,r4,32
        mtctr   r5
        bctrl
        ld      r2,40(r1)
because the linker does not know (and does not check) that the value in r2
changed in between the instruction using the .LC12@toc@ha (TOC-relative)
relocation and the instruction using the .LC12@toc@l(3) relocation.
Because it finds these instructions using the relocations (and not by
scanning the instructions), it has been asserted that there is no good way
to detect the change of r2 in between. As a result, this bug may never be
fixed (i.e. it may become part of the definition of the ABI). GCC was
updated to add extra dependencies on r2 to instructions using the @toc@l
relocations to avoid this problem, and we'll do the same here.

This is done as a separate pass because:
 1. These extra r2 dependencies are not really properties of the
    instructions, but rather due to a linker bug, and maybe one day we'll be
    able to get rid of them when targeting linkers without this bug (and,
    thus, keeping the logic centralized here will make that
    straightforward).
 2. There are ISel-level peephole optimizations that propagate the @toc@l
    relocations to some user instructions, and so the exta dependencies do
    not apply only to a fixed set of instructions (without undesirable
    definition replication).

The test case was reduced with the help of bugpoint, with minimal cleaning. I'm
looking forward to our upcoming MI serialization support, and with that, much
better tests can be created.

llvm-svn: 237556
2015-05-18 06:25:59 +00:00
James Molloy 53958e187a Reapply r237520 with another fix for infinite looping
SimplifyDemandedBits was "simplifying" a constant by removing just sign bits.
This caused a canonicalization race between different parts of instcombine.

Fix and regression test added - third time lucky?

llvm-svn: 237539
2015-05-17 08:27:27 +00:00
Elena Demikhovsky a8200603d4 AVX-512: fixed extended load to 512-bit register
llvm-svn: 237537
2015-05-17 08:08:06 +00:00
Elena Demikhovsky 1d6a495d6d AVX-512: fixed a bug in mask operations - (i1 1) pattern
Filling k-reg with all-ones value was wrong,
(i1 1) should switch on only one bit in mask register

llvm-svn: 237536
2015-05-17 07:28:51 +00:00
James Molloy e8698ae3e1 Revert commits r237521 and r237520.
The AArch64 LNT bot is unhappy - I've found that the problem is in
SimpliftDemandedBits, but that's going to require another code review
so reverting in the meantime.

llvm-svn: 237528
2015-05-16 21:27:14 +00:00
James Molloy 8ae1224fcf Update to r237520 - swap order of CHECK-NEXT lines.
... I'd copied the check-next lines from a previous test so they were
slightly wrong, and had managed to test the wrong source tree. D'oh!

llvm-svn: 237521
2015-05-16 13:26:25 +00:00
James Molloy b5aa200a33 Reapply r237453 with a fix for the test timeouts.
The test timeouts were due to instcombine fighting itself. Regression test added.
Original log message:

Canonicalize min/max expressions correctly.

This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
    %2 = sext i32 %a to i64
      %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
    %2 = select i1 %1, i32 %a, i32 0
      %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237520
2015-05-16 13:10:45 +00:00
Ahmed Bougacha f8fa3b8d4b [MemCpyOpt] Turn memcpy from just-memset'd source into memset.
There's no point in copying around constants, so, when all else fails,
we can still transform memcpy of memset into two independent memsets.

To quote the example, we can turn:
  memset(dst1, c, dst1_size);
  memcpy(dst2, dst1, dst2_size);
into:
  memset(dst1, c, dst1_size);
  memset(dst2, c, dst2_size);
When dst2_size <= dst1_size.

Like r235232 for copy constructors, this can occur in move constructors.

Differential Revision: http://reviews.llvm.org/D9682

llvm-svn: 237506
2015-05-16 01:32:26 +00:00
Ahmed Bougacha d2b8fc1f3a Remove dead code in testcase. NFC.
llvm-svn: 237501
2015-05-16 01:10:40 +00:00
Bill Schmidt 5ed84cdba8 [PPC64] Add vector pack/unpack support from ISA 2.07
This patch adds support for the following new instructions in the
Power ISA 2.07:

  vpksdss
  vpksdus
  vpkudus
  vpkudum
  vupkhsw
  vupklsw

These instructions are available through the vec_packs, vec_packsu,
vec_unpackh, and vec_unpackl built-in interfaces.  These are
lane-sensitive instructions, so the built-ins have different
implementations for big- and little-endian, and the instructions must
be marked as killing the vector swap optimization for now.

The first three instructions perform saturating pack operations.  The
fourth performs a modulo pack operation, which means it can be
represented with a vector shuffle, and conversely the appropriate
vector shuffles may cause this instruction to be generated.  The other
instructions are only generated via built-in support for now.

Appropriate tests have been added.

There is a companion patch to clang for the rest of this support.

llvm-svn: 237499
2015-05-16 01:02:12 +00:00
Filipe Cabecinhas 1c299d05e6 [BitcodeReader] Don't allow INSERTVAL/EXTRACTVAL with 0 indices
This would trigger an assertion later.

Bug found with AFL fuzz.

llvm-svn: 237494
2015-05-16 00:33:12 +00:00
Jingyue Wu 154eb5aa1d Add a speculative execution pass
Summary:
This is a pass for speculative execution of instructions for simple if-then (triangle) control flow. It's aimed at GPUs, but could perhaps be used in other contexts. Enabling this pass gives us a 1.0% geomean improvement on Google benchmark suites, with one benchmark improving 33%.

Credit goes to Jingyue Wu for writing an earlier version of this pass.

Patched by Bjarke Roune. 

Test Plan:
This patch adds a set of tests in test/Transforms/SpeculativeExecution/spec.ll
The pass is controlled by a flag which defaults to having the pass not run.

Reviewers: eliben, dberlin, meheff, jingyue, hfinkel

Reviewed By: jingyue, hfinkel

Subscribers: majnemer, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9360

llvm-svn: 237459
2015-05-15 17:54:48 +00:00
James Molloy 1675b4a57f Revert "Canonicalize min/max expressions correctly."
This reverts r237453 - it was causing timeouts on some bots. Reverting
while I investigate (it's probably InstCombine fighting itself...)

llvm-svn: 237458
2015-05-15 17:45:09 +00:00
Jingyue Wu 80a96d299a [SLSR] handle (B | i) * S
Summary:
Consider (B | i) * S as (B + i) * S if B and i have no bits set in
common.

Test Plan: @or in slsr-mul.ll

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9788

llvm-svn: 237456
2015-05-15 17:07:48 +00:00
James Molloy cfb0443af6 Mark SMIN/SMAX/UMIN/UMAX nodes as legal and add patterns for them.
The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for
most NEON datatypes - the big exclusion being v2i64.

llvm-svn: 237455
2015-05-15 16:15:57 +00:00
James Molloy 6edf0b4cd4 Canonicalize min/max expressions correctly.
This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
  %2 = sext i32 %a to i64
  %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
  %2 = select i1 %1, i32 %a, i32 0
  %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237453
2015-05-15 16:10:59 +00:00
Simon Atanasyan eeb2fa9877 [llvm-readobj] Teach llvm-readobj to print PT_MIPS_ABIFLAGS program header
llvm-svn: 237451
2015-05-15 15:59:22 +00:00
Nemanja Ivanovic ce6211f7ff NFC - Test case invokes llc on a file rather than redirected from a file.
This has caused some local failures. Updating the test case to be more
like the majority of the similar test cases.
Committing on behalf of Hubert Tong (hstong@ca.ibm.com).

llvm-svn: 237449
2015-05-15 15:29:53 +00:00
James Molloy c0661aeaf8 [DependenceAnalysis] Fix for PR21585: collectUpperBound triggers asserts
collectUpperBound hits an assertion when the back edge count is wider then the desired type.

If that happens, truncate the backedge count.

Patch by Philip Pfaffe!

llvm-svn: 237439
2015-05-15 12:17:22 +00:00
Toma Tabacu a3d056fd4c [mips] [IAS] Fix expansion of negative 32-bit immediates for LI/DLI.
Summary:
To maintain compatibility with GAS, we need to stop treating negative 32-bit immediates as 64-bit values when expanding LI/DLI.
This currently happens because of sign extension.

To do this we need to choose the 32-bit value expansion for values which use their upper 33 bits only for sign extension (i.e. no 0's, only 1's).

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8662

llvm-svn: 237428
2015-05-15 09:42:11 +00:00
Sanjoy Das 2c2661456e [PlaceSafepoints] Fix a bug that came in with rL236672.
Transfer the calling convention from the invoke being replaced by
PlaceStatepoints to the new invoke to gc.statepoint created.  Add a test
case that would have caught this issue.

llvm-svn: 237414
2015-05-15 00:26:21 +00:00
Sanjoy Das 8045810c58 [PlaceSafepoints] Fix a bug that came in with rL236672.
rL236672 would generate all invoke statepoints with deopt args set to a
list containing the single element "0", instead of an empty list.

Also add a test case that would have caught this.

llvm-svn: 237413
2015-05-15 00:26:15 +00:00
Akira Hatanaka 60f21fdd50 Fix the check strings in a test case committed in r212455.
The access size (8, in this case) was missing in the function name that was
being checked.

llvm-svn: 237410
2015-05-15 00:12:26 +00:00
Jingyue Wu ca32190379 [ValueTracking] refactor: extract method haveNoCommonBitsSet
Summary:
Extract method haveNoCommonBitsSet so that we don't have to duplicate this logic in
InstCombine and SeparateConstOffsetFromGEP.

This patch also makes SeparateConstOffsetFromGEP more precise by passing
DominatorTree to computeKnownBits.

Test Plan: value-tracking-domtree.ll that tests ValueTracking indeed leverages dominating conditions

Reviewers: broune, meheff, majnemer

Reviewed By: majnemer

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9734

llvm-svn: 237407
2015-05-14 23:53:19 +00:00
Wei Mi bf727ba371 Add another InstCombine pass after LoopUnroll.
This is to cleanup some redundency generated by LoopUnroll pass. Such redundency may not be cleaned up by existing passes after LoopUnroll.

Differential Revision: http://reviews.llvm.org/D9777

llvm-svn: 237395
2015-05-14 22:02:54 +00:00
Brendon Cahoon 7c8a3b0ef6 [Hexagon] Generate hardware loop for a vectorized loop
The induction variable in the vectorized loop wasn't
recognized properly, so a hardware loop wasn't generated.

Differential Revision: http://reviews.llvm.org/D9722

llvm-svn: 237388
2015-05-14 20:36:19 +00:00
Andrea Di Biagio 2905999c22 [ConstantFolding] Fix wrong folding of intrinsic 'convert.from.fp16'.
Function 'ConstantFoldScalarCall' (in ConstantFolding.cpp) works under the
wrong assumption that a call to 'convert.from.fp16' returns a value of
type 'float'.
However, intrinsic 'convert.from.fp16' can be overloaded; for example, we
can call 'convert.from.fp16.f64' to convert from half to double; etc.

Before this patch, the following example would have triggered an assertion
failure in opt (with -constprop):

```
define double @foo() {
entry:
  %0 = call double @llvm.convert.from.fp16.f64(i16 0)
  ret double %0
}
```

This patch fixes the problem in ConstantFolding.cpp. When folding a call to
convert.from.fp16, we perform a different kind of conversion based on the call
return type.

Added test 'Transform/ConstProp/convert-from-fp16.ll'.

Differential Revision: http://reviews.llvm.org/D9771

llvm-svn: 237377
2015-05-14 18:01:48 +00:00
Brendon Cahoon 485bea74ad [Hexagon] Remove dead constant assignment in hardware loop pass
After converting a loop to a hardware loop, the pass should remove
any unnecessary instructions from the old compare-and-branch
code. This patch removes a dead constant assignment that was
used in the compare instruction.

Differential Revision: http://reviews.llvm.org/D9720

llvm-svn: 237373
2015-05-14 17:31:40 +00:00
Toma Tabacu e625b5fc02 [mips] [IAS] Enforce .set nomacro.
Summary: When used, ".set nomacro" causes warning messages to be reported when we expand pseudo-instructions to multiple machine instructions.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9564

llvm-svn: 237366
2015-05-14 14:51:32 +00:00
Brendon Cahoon 9376e9998e [Hexagon] Check for underflow/wrap in hardware loop pass
If the loop trip count may underflow or wrap, the compiler should
not generate a hardware loop since the trip count will be
incorrect.

llvm-svn: 237365
2015-05-14 14:15:08 +00:00
Toma Tabacu 772155cbc6 [mips] [IAS] Emit .set macro/nomacro.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9563

llvm-svn: 237363
2015-05-14 13:42:10 +00:00
Vasileios Kalintiris 70b744e4a1 [mips] Do not place users of $ra in the delay slot of call instructions.
Summary:
When we are trying to fill the delay slot of a call instruction, we must avoid
filler instructions that use the $ra register. This fixes the test
MultiSource/Applications/JM/lencod when we enable the forward delay slot filler.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9670

llvm-svn: 237362
2015-05-14 13:17:56 +00:00
Artyom Skrobov a70dfe18d3 Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases
No longer breaks SPEC2000/2006

llvm-svn: 237361
2015-05-14 12:59:46 +00:00
Adam Nemet 938d3d63d6 New Loop Distribution pass
Summary:
This implements the initial version as was proposed earlier this year
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html).
Since then Loop Access Analysis was split out from the Loop Vectorizer
and was made into a separate analysis pass.  Loop Distribution becomes
the second user of this analysis.

The pass is off by default and can be enabled
with -enable-loop-distribution.  There is currently no notion of
profitability; if there is a loop with dependence cycles, the pass will
try to split them off from other memory operations into a separate loop.

I decided to remove the control-dependence calculation from this first
version.  This and the issues with the PDT are actively discussed so it
probably makes sense to treat it separately.  Right now I just mark all
terminator instruction required which keeps identical CFGs for each
distributed loop.  This seems to be working pretty well for 456.hmmer
where even though there is an empty if-then block in the distributed
loop initially, it gets completely removed.

The pass keeps DominatorTree and LoopInfo updated.  I've tested this
with -loop-distribute-verify with the testsuite where we distribute ~90
loops.  SimplifyLoop is violated in some cases and I have a FIXME
covering this.

Reviewers: hfinkel, nadav, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8831

llvm-svn: 237358
2015-05-14 12:05:18 +00:00
Toma Tabacu ec1de82213 [mips] [IAS] Warn when LA is used with a 64-bit symbol.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9295

llvm-svn: 237356
2015-05-14 10:53:40 +00:00
Elena Demikhovsky d5b3e376d2 AVX-512: Added i1 type handling for calling conventions.
i1 type is a legal type on AVX-512 and can be passed as parameter or return value.
i1 is promoted to i8 on return and to i32 for call arguments (i8 is also promoted to i32 here).
The result code is similar to the previous X86 targets, where i1 is allways promoted to i8.

llvm-svn: 237350
2015-05-14 09:04:45 +00:00
Andy Ayers 9e5c851419 Don't omit the constant when computing a cross-section relative relocation.
Differential Revision: http://reviews.llvm.org/D9692

llvm-svn: 237327
2015-05-14 01:10:41 +00:00
Ahmed Bougacha 6402ad27c0 [CodeGen] Use standard -not gnueabi- naming for f16 libcalls on Darwin.
Other targets probably should as well.  Since r237161, compiler-rt has
both, but I don't see why anything other than gnueabi would use a
gnueabi naming scheme.

llvm-svn: 237324
2015-05-14 01:00:51 +00:00
Alex Lorenz a22b250c6f YAML: Implement block scalar parsing.
This commit implements the parsing of YAML block scalars.
Some code existed for it before, but it couldn't parse block
scalars.

This commit adds a new yaml node type to represent the block
scalar values. 

This commit also deletes the 'spec-09-27' and 'spec-09-28' tests
as they are identical to the test file 'spec-09-26'.

This commit introduces 3 new utility functions to the YAML scanner
class: `skip_s_space`, `advanceWhile` and `consumeLineBreakIfPresent`.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D9503

llvm-svn: 237314
2015-05-13 23:10:51 +00:00
Douglas Katzman 6dc1397298 [X86] Fix PR23271 - RIP-relative decoding bug in disassembler.
Differential Revision: http://reviews.llvm.org/D9110

llvm-svn: 237310
2015-05-13 22:44:52 +00:00
Justin Bogner d0ceebf160 InstrProf: Fix display of large numbers in llvm-cov
llvm-cov was truncating numbers that were larger than a particular
fixed width, which is as confusing as it is useless. Instead, we use
engineering notation with SI prefix for magnitude.

llvm-svn: 237307
2015-05-13 22:41:48 +00:00
Sanjoy Das ba74e645d8 [PlaceSafepoints] New attributes for patchable statepoints.
Summary:
This patch teaches the PlaceSafepoints pass about two `CallSite`
function attributes:

 * "statepoint-id": if the string value of this attribute can be parsed
   as an integer, then it is propagated to the ID parameter of the
   statepoint created.

 * "statepoint-num-patch-bytes": if the string value of this attribute
   can be parsed as an integer, then it is propagated to the `num patch
   bytes` parameter of the statepoint created.

This change intentionally does not assert on a malformed value for these
attributes, given that they're not "official" attributes.

Reviewers: reames, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9735

llvm-svn: 237286
2015-05-13 20:11:31 +00:00
Jingyue Wu c74e33bffe [NaryReassociate] avoid running forever
Avoid running forever by checking we are not reassociating an expression into
the same form.

Tested with @avoid_infinite_loops in nary-add.ll

llvm-svn: 237269
2015-05-13 18:12:24 +00:00
Brendon Cahoon d11c92a41c [Hexagon] Generate loop1 instruction for nested loops
loop1 is for the outer loop and loop0 is for the inner loop.

Differential Revision: http://reviews.llvm.org/D9680

llvm-svn: 237266
2015-05-13 17:56:03 +00:00
Diego Novillo ffc84e378a Add function entry counts from sample profiles.
This patch uses the new function profile metadata "function_entry_count"
to annotate entry counts from sample profiles.

In a sampling profile, the total samples collected at the function entry
are an approximation for the number of times that function was invoked.

llvm-svn: 237265
2015-05-13 17:04:29 +00:00
Diego Novillo 2567f3d0fb Add function entry count metadata.
Summary:
This adds three Function methods to handle function entry counts:
setEntryCount() and getEntryCount().

Entry counts are stored under the MD_prof metadata node with the name
"function_entry_count". They are unsigned 64 bit values set by profilers
(instrumentation and sample profiler changes coming up).

Added documentation for new profile metadata and tests.

Reviewers: dexonsmith, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9628

llvm-svn: 237260
2015-05-13 15:13:45 +00:00
Brendon Cahoon 254e656862 [Hexagon] Generate hardware loop when loop has a critical edge
The hardware loop pass should try to generate a hardware loop
instruction when the original loop has a critical edge.

Differential Revision: http://reviews.llvm.org/D9678

llvm-svn: 237258
2015-05-13 14:54:24 +00:00
Jozef Kolek 6fec325d10 [mips][microMIPSr6] Implement CLO and CLZ instructions
This patch implements CLO and CLZ instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8553

llvm-svn: 237257
2015-05-13 14:18:11 +00:00
Silviu Baranga 780a3b3be7 Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006
llvm-svn: 237256
2015-05-13 14:03:18 +00:00
Toma Tabacu d0a7ff2ed7 [mips] [IAS] Unify common functionality of LA and LI.
Summary: A side-effect of this is that LA gains proper handling of unsigned and positive signed 16-bit immediates and more accurate error messages.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9290

llvm-svn: 237255
2015-05-13 13:56:16 +00:00
Artyom Skrobov b526681e08 [AArch64] Codegen VMAX/VMIN for safe math cases
llvm-svn: 237247
2015-05-13 12:01:09 +00:00
Michael Kuperstein c3434b390d Reverting r237234, "Use std::bitset for SubtargetFeatures"
The buildbots are still not satisfied.
MIPS and ARM are failing (even though at least MIPS was expected to pass).

llvm-svn: 237245
2015-05-13 10:28:46 +00:00
Toma Tabacu 9fb2ff71ca [mips] [IAS] Merge the micromips-expressions.s test into expr1.s. NFC.
Summary: Also did some minor reformatting in the resulting test.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9702

llvm-svn: 237242
2015-05-13 09:53:53 +00:00
Sergey Dmitrouk 46c4f02848 [DebugInfo] Debug locations for constant SD nodes
Several updates for [DebugInfo] Add debug locations to constant SD nodes (r235989).
Includes:

 *  re-enabling the change (disabled recently);
 *  missing change for FP constants;
 *  resetting debug location of constant node if it's used more than at one place
    to prevent emission of wrong locations in case of coalesced constants;
 *  a couple of additional tests.

Now all look ups in CSEMap are wrapped by additional method.

Comment in D9084 suggests that debug locations aren't useful for "target constants",
so there might be one more change related to this API (namely, dropping debug
locations for getTarget*Constant methods).

Differential Revision: http://reviews.llvm.org/D9604

llvm-svn: 237237
2015-05-13 08:58:03 +00:00
Michael Kuperstein aba4a34ef2 Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first two times this was committed (r229831, r233055), it caused several buildbot failures. 
At least some of the ARM and MIPS ones were due to gcc/binutils issues, and should now be fixed.

llvm-svn: 237234
2015-05-13 08:27:08 +00:00
Elena Demikhovsky 1b2f2f1b37 AVX-512: fixed a bug in encoding of VPSRAQ instrcution,
added a bunch of encoding tests.

llvm-svn: 237232
2015-05-13 07:35:05 +00:00
Jingyue Wu 4b6125d788 [SLSR] handles non-canonicalized Mul candidates
such as (2 + B) * S.

Tested by @non_canonicalized in slsr-mul.ll

llvm-svn: 237216
2015-05-13 00:03:17 +00:00
Sanjoy Das a1d39ba940 [Statepoints] Support for "patchable" statepoints.
Summary:
This change adds two new parameters to the statepoint intrinsic, `i64 id`
and `i32 num_patch_bytes`.  `id` gets propagated to the ID field
in the generated StackMap section.  If the `num_patch_bytes` is
non-zero then the statepoint is lowered to `num_patch_bytes` bytes of
nops instead of a call (the spill and reload code remains unchanged).
A non-zero `num_patch_bytes` is useful in situations where a language
runtime requires complete control over how a call is lowered.

This change brings statepoints one step closer to patchpoints.  With
some additional work (that is not part of this patch) it should be
possible to get rid of `TargetOpcode::STATEPOINT` altogether.

PlaceSafepoints generates `statepoint` wrappers with `id` set to
`0xABCDEF00` (the old default value for the ID reported in the stackmap)
and `num_patch_bytes` set to `0`.  This can be made more sophisticated
later.

Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9546

llvm-svn: 237214
2015-05-12 23:52:24 +00:00
Saleem Abdulrasool ee13fbe848 CodeGen: ignore DEBUG_VALUE nodes in KILL tagging
DEBUG_VALUE nodes do not take part in code generation.  Ignore them when
performing KILL updates.  Addresses PR23486.

llvm-svn: 237211
2015-05-12 23:36:18 +00:00
Chandler Carruth 942fba95e2 Revert r237175: [X86] Always return the sret parameter in eax/rax ...
This commit broke an x86 test and the bots have been broken for well
over an hour now so I'm just reverting.

llvm-svn: 237210
2015-05-12 23:34:27 +00:00