Commit Graph

75437 Commits

Author SHA1 Message Date
David Majnemer 40157d5c4d InstCombine: Restore optimizations lost in r210006
This restores our ability to optimize:
(X & C) == 0 ? X ^ C : X  into  X | C
(X & C) != 0 ? X ^ C : X  into  X & ~C

llvm-svn: 222871
2014-11-27 07:25:21 +00:00
NAKAMURA Takumi 5d50b1a432 Add LLVMObject to LLVMExecutionEngine.
llvm-svn: 222869
2014-11-27 06:36:22 +00:00
David Majnemer c6a5e1dd4f InstSimplify: Restore optimizations lost in r210006
This restores our ability to optimize:
(X & C) ? X & ~C : X  into  X & ~C
(X & C) ? X : X & ~C  into  X
(X & C) ? X | C : X  into  X
(X & C) ? X : X | C  into  X | C

llvm-svn: 222868
2014-11-27 06:32:46 +00:00
Lang Hames a5cd950c73 [MCJIT] Remove the local symbol table from RuntimeDlyd - it's not needed.
All symbols have to be stored in the global symbol to enable
cross-rtdyld-instance linking, so the local symbol table content is
redundant.

llvm-svn: 222867
2014-11-27 05:40:13 +00:00
Lang Hames 7ea98e142b [MCJIT] Replace JITEventListener::anchor (temporarily removed in r222861), and
move GDBRegistrationListener into ExecutionEngine to avoid layering violation.

llvm-svn: 222864
2014-11-27 01:41:16 +00:00
Lang Hames a662d16314 [MCJIT] Remove JITEventListener's anchor until I can determine the right place
to put it. This should unbreak the Mips bots.

llvm-svn: 222861
2014-11-27 00:15:28 +00:00
Lang Hames bf6952c87a [MCJIT] Move get-any-symbol-load-address logic out of RuntimeDyld and into
RuntimeDyldChecker.

RuntimeDyld instances should only provide lookup for locally defined
symbols.

llvm-svn: 222859
2014-11-27 00:12:28 +00:00
David Majnemer 5468e86469 Revert "Added inst combine transforms for single bit tests from Chris's note"
This reverts commit r210006, it miscompiled libapr which is used in who
knows how many projects.

A test has been added to ensure that we don't regress again.

I'll work on a rewrite of what the optimization was trying to do later.

llvm-svn: 222856
2014-11-26 23:00:38 +00:00
Rui Ueyama 98fe58a3a7 Object/COFF: Fix off-by-one error for object having lots of relocations
llvm-objdump printed out an error message for this off-by-one error,
but because it always exits with 0 whether or not it found an error,
the test (llvm-objdump/coff-many-relocs.test) succeeded.
I made llvm-objdump exit with EXIT_FAILURE when an error is found.

llvm-svn: 222852
2014-11-26 22:17:25 +00:00
Matt Arsenault fcdddf9602 R600/SI: Use ZeroOrNegativeOneBooleanContent
This sort of doesn't matter since the setcc type is i1, but
this previously was using the default UndefinedBooleanContent. This
makes it more consistent with R600. This enables more optimizations
which typically give up on UndefinedBooleanContent. For example,
there is already a special case target DAG combine for
setcc + sext which can be eliminated in favor of what the generic
DAG combiner can do if it assumes boolean values are sign extended.
Since -1 is an inline immediate, using it is basically free and the
backend already uses it when a boolean value is needed in a wider type.

llvm-svn: 222850
2014-11-26 21:23:15 +00:00
Colin LeMahieu 6e0f9f8d61 [Hexagon] Adding cmp* immediate form instructions.
llvm-svn: 222849
2014-11-26 19:43:12 +00:00
Jozef Kolek 315e7eca1b [mips][microMIPS] Implement disassembler support for 16-bit instructions LBU16, LHU16, LW16, SB16, SH16 and SW16
Differential Revision: http://reviews.llvm.org/D6405

llvm-svn: 222847
2014-11-26 18:56:38 +00:00
Colin LeMahieu 31abe33726 [Hexagon] Adding and64, or64, and xor64 instructions.
llvm-svn: 222846
2014-11-26 18:55:59 +00:00
Matt Arsenault b935089576 R600/SI: Create e64 versions of and/or/xor in SILowerI1Copies
This fixes moving boolean constants into registers before operating
on them. They get permuted and shrunk down to e32 anyway later. This
is a temporary fix until the patch that removes these pseudos is
committed.

llvm-svn: 222844
2014-11-26 18:18:28 +00:00
Lang Hames 40db56cc6d [MCJIT] Fix missing return statement.
llvm-svn: 222841
2014-11-26 17:21:41 +00:00
Lang Hames b5c7b1ff83 [MCJIT] Reapply r222828 and r222810-r222812 with fix for MSVC move-op issues.
llvm-svn: 222840
2014-11-26 16:54:40 +00:00
Aaron Ballman 9fb411431d Reverting r222828 and r222810-r222812 as they broke the build on Windows.
http://bb.pgr.jp/builders/ninja-clang-i686-msc17-R/builds/11753

llvm-svn: 222833
2014-11-26 15:27:39 +00:00
Aaron Ballman c53ead03ce Removing a spurious semicolon; NFC
llvm-svn: 222830
2014-11-26 13:55:55 +00:00
Evgeniy Stepanov 2a783392d4 Add missing "override".
Fixes compilation failure in r222810.

llvm-svn: 222828
2014-11-26 12:26:03 +00:00
Will Newton 40f08faa70 Update AArch64 ELF relocations to ABI 1.0
This mostly entails adding relocations, however there are a couple of
changes to existing relocations:

1. R_AARCH64_NONE is defined to be zero rather than 256

R_AARCH64_NONE has been defined to be zero for a long time elsewhere
e.g. binutils and glibc since the submission of the AArch64 port in
2012 so this is required for compatibility.

2. R_AARCH64_TLSDESC_ADR_PAGE renamed to R_AARCH64_TLSDESC_ADR_PAGE21

I don't think there is any way for relocation names to leak out of LLVM
so this should not break anything.

Tested with check-all with no regressions.

llvm-svn: 222821
2014-11-26 10:49:18 +00:00
Elena Demikhovsky 905a5a606f AVX-512: Scalar ERI intrinsics
including SAE mode and memory operand.
Added AVX512_maskable_scalar template, that should cover all scalar instructions in the future.

The main difference between AVX512_maskable_scalar<> and AVX512_maskable<> is using X86select instead of vselect.
I need it, because I can't create vselect node for MVT::i1 mask for scalar instruction.

http://reviews.llvm.org/D6378

llvm-svn: 222820
2014-11-26 10:46:49 +00:00
Lang Hames 7cacb9b540 [MCJIT] Re-enable GDB registration (temporarily disabled in r222811), but check
that we actually have an object to register first.

For MachO objects, RuntimeDyld::LoadedObjectInfo::getObjectForDebug returns an
empty OwningBinary<ObjectFile> which was causing crashes in the GDB registration
code.

llvm-svn: 222812
2014-11-26 07:39:03 +00:00
Lang Hames 08a01ae372 [MCJIT] Temporarily disable automatic JIT debugger registration.
The RuntimeDyld cleanup patch r222810 turned on GDB registration for MachO
objects. I expected this to be harmless, but it seems to have broken on
MacsOS. Temporarily disabling debugger registration while I dig in to what's
gone wrong.

llvm-svn: 222811
2014-11-26 07:25:26 +00:00
Lang Hames 829a19ae74 [MCJIT] Clean up RuntimeDyld's quirky object-ownership/modification scheme.
Previously, when loading an object file, RuntimeDyld (1) took ownership of the
ObjectFile instance (and associated MemoryBuffer), (2) potentially modified the
object in-place, and (3) returned an ObjectImage that managed ownership of the
now-modified object and provided some convenience methods. This scheme accreted
over several years as features were tacked on to RuntimeDyld, and was both
unintuitive and unsafe (See e.g. http://llvm.org/PR20722).

This patch fixes the issue by removing all ownership and in-place modification
of object files from RuntimeDyld. Existing behavior, including debugger
registration, is preserved.

Noteworthy changes include:

(1) ObjectFile instances are now passed to RuntimeDyld by const-ref.
(2) The ObjectImage and ObjectBuffer classes have been removed entirely, they
    existed to model ownership within RuntimeDyld, and so are no longer needed.
(3) RuntimeDyld::loadObject now returns an instance of a new class,
    RuntimeDyld::LoadedObjectInfo, which can be used to construct a modified
    object suitable for registration with the debugger, following the existing
    debugger registration scheme.
(4) The JITRegistrar class has been removed, and the GDBRegistrar class has been
    re-written as a JITEventListener.

This should fix http://llvm.org/PR20722 .

llvm-svn: 222810
2014-11-26 06:53:26 +00:00
Craig Topper c50d64b07b Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files.
llvm-svn: 222801
2014-11-26 00:46:26 +00:00
Simon Pilgrim 371417db34 [X86][SSE] Improvements to byte shift shuffle matching
Since (v)pslldq / (v)psrldq instructions resolve to a single input argument it is useful to match it much earlier than we currently do - this prevents more complicated shuffles (notably insertion into a zero vector) matching before it.

Differential Revision: http://reviews.llvm.org/D6409

llvm-svn: 222796
2014-11-25 22:34:59 +00:00
Colin LeMahieu b3d08bb44b [Hexagon] Adding add64 and sub64 instructions.
llvm-svn: 222795
2014-11-25 22:15:44 +00:00
Colin LeMahieu 6f6c4ff1fc Reverting 222792
llvm-svn: 222793
2014-11-25 21:39:57 +00:00
Colin LeMahieu aaf33928ee [Hexagon] Adding compare with immediate instructions.
llvm-svn: 222792
2014-11-25 21:30:28 +00:00
Colin LeMahieu 6f352b03a4 [Hexagon] Adding NOP encoding bits.
llvm-svn: 222791
2014-11-25 21:23:07 +00:00
Matt Arsenault 3896a0a99c R600/SI: Only use one DEBUG()
llvm-svn: 222789
2014-11-25 21:03:22 +00:00
Cameron McInally 9b7c15a364 [AVX512] Add 512b integer shift by variable intrinsics and patterns.
llvm-svn: 222786
2014-11-25 20:41:51 +00:00
Colin LeMahieu e83bc7476f [Hexagon] Adding C2_mux instruction.
llvm-svn: 222784
2014-11-25 20:20:09 +00:00
Craig Topper edb091185d Remove space before tab in all AVX512 mnemonic strings.
llvm-svn: 222778
2014-11-25 20:11:23 +00:00
Colin LeMahieu 902157c249 [Hexagon] Replacing cmp* instructions with ones that contain encoding bits.
llvm-svn: 222771
2014-11-25 18:20:52 +00:00
Hans Wennborg 45172aceb3 LazyValueInfo: Actually re-visit partially solved block-values in solveBlockValue()
If solveBlockValue() needs results from predecessors that are not already
computed, it returns false with the intention of resuming when the dependencies
have been resolved. However, the computation would never be resumed since an
'overdefined' result had been placed in the cache, preventing any further
computation.

The point of placing the 'overdefined' result in the cache seems to have been
to break cycles, but we can check for that when inserting work items in the
BlockValue stack instead. This makes the "stop and resume" mechanism of
solveBlockValue() work as intended, unlocking more analysis.

Using this patch shaves 120 KB off a 64-bit Chromium build on Linux.

I benchmarked compiling bzip2.c at -O2 but couldn't measure any difference in
compile time.

Tests by Jiangning Liu from r215343 / PR21238, Pete Cooper, and me.

Differential Revision: http://reviews.llvm.org/D6397

llvm-svn: 222768
2014-11-25 17:23:05 +00:00
Rafael Espindola c81c3f554c Set the body of a new struct as soon as it is created.
This changes the order in which different types are passed to get, but
one order is not inherently better than the other.

The main motivation is that this simplifies linkDefinedTypeBodies now that
it is only linking "real" opaque types. It is also means that we only have to
call it once and that we don't need getImpl.

A small change in behavior is that we don't copy type names when resolving
opaque types. This is an improvement IMHO, but it can be added back if
desired. A test is included with the new behavior.

llvm-svn: 222764
2014-11-25 15:33:40 +00:00
Evgeniy Stepanov 28cacae2f1 [msan] Annotate zlib functions for MemorySanitizer.
Mark destination buffer in zlib::compress and zlib::decompress as fully
initialized.

When building LLVM with system zlib and MemorySanitizer instrumentation,
MSan does not observe memory writes in zlib code and erroneously considers
zlib output buffers as uninitialized, resulting in false use-of-uninitialized
memory reports. This change helps MSan understand the state of that memory
and prevents such reports.

llvm-svn: 222763
2014-11-25 15:24:07 +00:00
Rafael Espindola 290e2cc520 Misc style fixes. NFC.
This just reduces the noise in the next patch.

llvm-svn: 222761
2014-11-25 14:35:53 +00:00
Joerg Sonnenberger cf0ea262b1 Reapply 222538 and update tests to explicitly request small code model
and PIC:

Allow FDE references outside the +/-2GB range supported by PC relative
offsets for code models other than small/medium. For JIT application,
memory layout is less controlled and can result in truncations
otherwise.

Patch from Akos Kiss.

Differential Revision: http://reviews.llvm.org/D6079

llvm-svn: 222760
2014-11-25 13:37:55 +00:00
Rafael Espindola 5e7f98071a Remove a bit of duplicated code.
Exactly the same checks are present in areTypesIsomorphic.

This might have been a premature performance optimization. I cannot reproduce
any slowdown with this patch.

llvm-svn: 222758
2014-11-25 13:19:46 +00:00
Chandler Carruth 4c8cf4f7bc Revert r222746: That commit did not update any tests and caused two R600
tests to start failing.

Original commit log: R600/SI: Disable commutativity for MIN/MAX_LEGACY

llvm-svn: 222753
2014-11-25 10:50:41 +00:00
Zoran Jovanovic b554bba90f [mips][micromips] Use call instructions with short delay slots
Differential Revision: http://reviews.llvm.org/D6338

llvm-svn: 222752
2014-11-25 10:50:00 +00:00
Chandler Carruth 816d26fe5e [InstCombine] Change LLVM To canonicalize toward the value type being
stored rather than the pointer type.

This change is analogous to r220138 which changed the canonicalization
for loads. The rationale is the same: memory does not have a type,
operations (and thus the values they produce) have a type. We should
match that type as closely as possible rather than reading some form of
semantics into the pointer type.

With this change, loads and stores should no longer be made with
nonsensical types for the values that tehy load and store. This is
particularly important when trying to match specific loaded and stored
types in the process of doing other instcombines, which is what led me
down this twisty maze of miscanonicalization.

I've put quite some effort into looking through IR to find places where
LLVM's optimizer was being unreasonably conservative in the face of
mismatched load and store types, however it is possible (let's say,
likely!) I have missed some. If you see regressions here, or from
r220138, the likely cause is some part of LLVM failing to cope with load
and store types differing. Test cases appreciated, it is important that
we root all of these out of LLVM.

llvm-svn: 222748
2014-11-25 10:09:51 +00:00
Marek Olsak f1449e99b9 R600/SI: Disable commutativity for MIN/MAX_LEGACY
llvm-svn: 222746
2014-11-25 09:49:23 +00:00
Chandler Carruth 1a3c2c414c Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite
clearly only exactly equal width ptrtoint and inttoptr casts are no-op
casts, it says so right there in the langref. Make the code agree.

Original log from r220277:
Teach the load analysis to allow finding available values which require
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.

To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.

These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.

I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.

llvm-svn: 222739
2014-11-25 08:20:27 +00:00
Matt Arsenault e335fd343e R600/SI: Fix allocating flat_scr_lo / flat_scr_hi
Only the super register flat_scr was marked as reserved,
so in some cases with high register usage it would still
try to allocate the subregisters.

llvm-svn: 222737
2014-11-25 07:53:06 +00:00
David Majnemer c7353b5890 COFF: Add back an assertion that is superseded by r222124
llvm-svn: 222735
2014-11-25 07:43:14 +00:00
Rafael Espindola 8f14471e81 Use a range loop. NFC.
llvm-svn: 222730
2014-11-25 06:16:27 +00:00
Rafael Espindola c84f608b08 Style fix: don't indent inside a namemespace.
llvm-svn: 222729
2014-11-25 06:11:24 +00:00
Rafael Espindola f010311166 Remove a nested anonymous namespace.
llvm-svn: 222728
2014-11-25 06:07:51 +00:00
Rafael Espindola 86911440c2 Fix overly aggressive type merging.
If we find out that two types are *not* isomorphic, we learn nothing about
opaque sub types in both the source and destination.

llvm-svn: 222727
2014-11-25 05:59:24 +00:00
Rafael Espindola e96d7eb8bd Link the type of aliases.
They are not more or less "well typed" than GlobalVariables.

llvm-svn: 222725
2014-11-25 04:43:59 +00:00
Rafael Espindola d2a13a2ec6 Don't repeat name in comment or duplicate comment. NFC.
llvm-svn: 222724
2014-11-25 04:28:31 +00:00
Rafael Espindola c8a476ee11 Use range loops. NFC.
llvm-svn: 222723
2014-11-25 04:26:19 +00:00
Juergen Ributzka eb67bd8d74 [FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching.
The pattern matching failed to recognize all instances of "-1", because when
comparing against "-1" we didn't use an APInt of the same bitwidth.

This commit fixes this and also adds inverse versions of the conditon to catch
more cases.

llvm-svn: 222722
2014-11-25 04:16:15 +00:00
David Majnemer bd9ce4ea51 InstSimplify: Handle some simple tautological comparisons
This handles cases where we are comparing a masked value against itself.
The analysis could be further improved by making it recursive but such
expense is not currently justified.

llvm-svn: 222716
2014-11-25 02:55:48 +00:00
David Blaikie cb2818fa76 Revert "unique_ptrify LLVMContextImpl::CAZConstants"
Missed the complexities of how these elements are destroyed.

This reverts commit r222714.

llvm-svn: 222715
2014-11-25 02:26:22 +00:00
David Blaikie 899b85a556 unique_ptrify LLVMContextImpl::CAZConstants
llvm-svn: 222714
2014-11-25 02:13:54 +00:00
Hal Finkel 5901676581 [PowerPC] Add the 'attn' instruction
The attn instruction is not part of the Power ISA, but is documented in the A2
user manual, and is accepted by the GNU assembler for the A2 and the POWER4+.
Reported as part of PR21650.

llvm-svn: 222712
2014-11-25 00:30:11 +00:00
Hal Finkel 360f213d03 [PowerPC] Implement combineRepeatedFPDivisors
This does not matter on newer cores (where we can use reciprocal estimates in
fast-math mode anyway), but for older cores this allows us to generate better
fast-math code where we have multiple FDIVs with a common divisor.

llvm-svn: 222710
2014-11-24 23:45:21 +00:00
Philip Reames 00d3b279cb Factor check for the assume intrinsic out of checks in computeKnownBitsFromAssume
We were matching against the assume intrinsic in every check.  Since we know that it must be an assume, this is just wasted work.  Somewhat surprisingly, matching an intrinsic id is actually relatively expensive.  It devolves to a string construction and comparison in Function::isIntrinsic.

I originally spotted this because it showed up in a performance profile of my compiler.  I've since discovered a separate issue which seems to be the actual root cause, but this is minor perf goodness regardless.  

I'm likely to follow up with another change to factor out the comparison matching.  There's no need to match the compare instruction in every single one of the tests.

Differential Revision: http://reviews.llvm.org/D6312

llvm-svn: 222709
2014-11-24 23:44:28 +00:00
Philip Reames 059ecbfb58 Incorporate review comments from r221742
This change implements the comment and style changes Sean requested during post commit review with r221742.  Sorry for the delay.

llvm-svn: 222707
2014-11-24 23:24:24 +00:00
Matt Arsenault 238ff1ad1e Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons
llvm-svn: 222705
2014-11-24 23:15:18 +00:00
Rafael Espindola 64ac12d626 Remove the unused FindUsedTypes pass.
It was dead since r134829.

llvm-svn: 222684
2014-11-24 20:53:26 +00:00
Rafael Espindola ffbfcf29f2 Add and use Type::subtypes. NFC.
llvm-svn: 222682
2014-11-24 20:44:36 +00:00
Chad Rosier ba0e0664ff [AArch64] Fix clobber computation in A57LoadBalancing pass.
Extremely difficult to reproduce, so no test case included.
PR21637

llvm-svn: 222677
2014-11-24 18:57:58 +00:00
Colin LeMahieu 287c4e1762 Removing unused variable.
llvm-svn: 222676
2014-11-24 18:55:32 +00:00
Kostya Serebryany 4cadd4afa0 [asan/coverage] change the way asan coverage instrumentation is done: instead of setting the guard to 1 in the generated code, pass the pointer to guard to __sanitizer_cov and set it there. No user-visible functionality change expected
llvm-svn: 222675
2014-11-24 18:49:53 +00:00
Ulrich Weigand a69bcd5ed0 [PowerPC] Fix PR 21652 - copy st_other bits on symbol assignment
When processing an assignment in the integrated assembler that sets
a symbol to the value of another symbol, we need to copy the st_other
bits that encode the local entry point offset.

Modeled after MipsTargetELFStreamer::emitAssignment handling of the
ELF::STO_MIPS_MICROMIPS flag.

llvm-svn: 222672
2014-11-24 18:09:47 +00:00
Paul Robinson c38deee807 More long path name support on Windows, this time in program execution.
Allows long paths for the executable and redirected stdin/stdout/stderr.
Addresses PR21563.

llvm-svn: 222671
2014-11-24 18:05:29 +00:00
Colin LeMahieu 397a25e7cd [Hexagon] Adding asrh instruction, removing unused multiclasses.
llvm-svn: 222670
2014-11-24 18:04:42 +00:00
Colin LeMahieu 3b3197ef95 [Hexagon] Adding aslh instruction.
llvm-svn: 222668
2014-11-24 17:44:19 +00:00
Colin LeMahieu 098256c5e6 [Hexagon] Adding zxth instruction.
llvm-svn: 222662
2014-11-24 17:11:34 +00:00
Colin LeMahieu bb7d6f5514 [Hexagon] Adding zxtb instruction.
llvm-svn: 222660
2014-11-24 16:48:43 +00:00
David Majnemer 8e6f6a98b5 InstCombine: Don't create an unused instruction
We would create an instruction but not inserting it.
Not inserting the unused instruction would lead us to verification
failure.

This fixes PR21653.

llvm-svn: 222659
2014-11-24 16:41:13 +00:00
Jozef Kolek 11bdb8bf33 [mips][microMIPS] Fix JRADDIUSP instruction
Fix JRADDIUSP instruction, remove delay slot flag because this instruction
doesn't have delay slot.

Differential Revision: http://reviews.llvm.org/D6365

llvm-svn: 222658
2014-11-24 16:14:10 +00:00
Jozef Kolek e8c9d1eaf7 [mips][microMIPS] Implement LBU16, LHU16, LW16, SB16, SH16 and SW16 instructions
Differential Revision: http://reviews.llvm.org/D5122

llvm-svn: 222653
2014-11-24 14:39:13 +00:00
Jozef Kolek 1904fa2197 [mips][microMIPS] Implement 16-bit instructions registers including ZERO instead of S0
Implement microMIPS 16-bit instructions register set: $0, $2-$7 and $17.

Differential Revision: http://reviews.llvm.org/D5780

llvm-svn: 222652
2014-11-24 14:25:53 +00:00
Aaron Ballman 41580deaf1 Removing a variable that is initialized but never read. The original author has been alerted to the warning, in case this variable is meant to be used. Fixes -Werror builds in the meantime.
llvm-svn: 222649
2014-11-24 14:03:16 +00:00
Jozef Kolek ea22c4cfbb [mips][microMIPS] Implement disassembler support for 16-bit instructions
With the help of new method readInstruction16() two bytes are read and
decodeInstruction() is called with DecoderTableMicroMips16, if this fails
four bytes are read and decodeInstruction() is called with
DecoderTableMicroMips32.

Differential Revision: http://reviews.llvm.org/D6149

llvm-svn: 222648
2014-11-24 13:29:59 +00:00
Andrea Di Biagio 23e2cfa834 [X86] Improved target specific combine on VSELECT dag nodes.
This patch teaches function 'transformVSELECTtoBlendVECTOR_SHUFFLE' how to
convert VSELECT dag nodes to shuffles on targets that do not have SSE4.1.
On pre-SSE4.1 targets, we can still perform blend operations using movss/movsd.

Also, removed a target specific combine that performed a premature lowering of
VSELECT nodes to target specific MOVSS/MOVSD nodes.

llvm-svn: 222647
2014-11-24 12:23:15 +00:00
David Majnemer b2a6e7458d InstCombine: Don't assume DataLayout is always available
We tried to get the result of DataLayout::getLargestLegalIntTypeSize but
we didn't have a DataLayout.  This resulted in opt crashing.

This fixes PR21651.

llvm-svn: 222645
2014-11-24 07:26:20 +00:00
Elena Demikhovsky 6de72e5b0c Converted back to Unix format (after my last commit 222632)
llvm-svn: 222636
2014-11-23 15:21:53 +00:00
Michael Kuperstein 9ef90647b9 [X86] Fixes bug in build_vector v4x32 lowering
r222375 made some improvements to build_vector lowering of v4x32 and v4xf32 into an insertps, but it missed a case where:

1. A single extracted element is used twice.
2. The lower of the two non-zero indexes should be preserved, and the higher should be used for the dest mask.

This caused a crash, since the source value for the insertps ends-up uninitialized.

Differential Revision: http://reviews.llvm.org/D6377

llvm-svn: 222635
2014-11-23 13:09:06 +00:00
Craig Topper 8c5128bf1b Add missing override keywords.
llvm-svn: 222634
2014-11-23 09:40:13 +00:00
Elena Demikhovsky 9e5089a938 Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191

llvm-svn: 222632
2014-11-23 08:07:43 +00:00
Matt Arsenault 2a495975ed R600: Fix extloads of i1 on R600/Evergreen
llvm-svn: 222631
2014-11-23 02:57:54 +00:00
Matt Arsenault 28638f1e2c R600: Fix assert on copy of an i1 on pre-SI
i1 is not a legal type on Evergreen, so this combine proceeded
and tried to produce a bitcast between i1 and i8.

llvm-svn: 222630
2014-11-23 02:57:52 +00:00
David Majnemer fb3805576b InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2)
llvm-svn: 222625
2014-11-22 20:00:41 +00:00
David Majnemer ec6e481bc5 InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y)
llvm-svn: 222624
2014-11-22 20:00:38 +00:00
David Majnemer fa4699e65f InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C)
llvm-svn: 222623
2014-11-22 20:00:34 +00:00
Simon Pilgrim a279410ede Tidied up target triple OS detection. NFC
Use Triple::isOS*() helper functions where possible.

llvm-svn: 222622
2014-11-22 19:12:10 +00:00
David Majnemer a3aeb15613 InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv x,C1<<C2)
llvm-svn: 222620
2014-11-22 18:16:54 +00:00
Chandler Carruth 5852b1f4c2 [x86] Teach the vector shuffle yet another step of canonicalization.
No functionality changed yet, but this will prevent subsequent patches
from having to handle permutations of various interleaved shuffle
patterns.

llvm-svn: 222614
2014-11-22 09:18:53 +00:00
David Majnemer 546f81064c InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y
llvm-svn: 222613
2014-11-22 08:57:02 +00:00
David Majnemer 8279a7506d InstCombine: Propagate NSW for -X * -Y -> X * Y
llvm-svn: 222612
2014-11-22 07:25:19 +00:00
David Majnemer 4efa9ff8ca InstSimplify: Simplify (sub 0, X) -> X if it's NUW
This is a generalization of the X - (0 - Y) -> X transform.

llvm-svn: 222611
2014-11-22 07:15:16 +00:00
David Majnemer 83484fdb8b InstCombine: Silence a parenthesis warning
llvm-svn: 222609
2014-11-22 06:09:28 +00:00
David Majnemer 80c8f627db InstCombine: Preserve nsw when folding X*(2^C) -> X << C
llvm-svn: 222606
2014-11-22 04:52:55 +00:00
David Majnemer fd4a6d2b7a InstCombine: Preserve nsw/nuw for ((X << C2)*C1) -> (X * (C1 << C2))
llvm-svn: 222605
2014-11-22 04:52:52 +00:00
David Majnemer 027bc80928 InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V)
llvm-svn: 222604
2014-11-22 04:52:38 +00:00
Gerolf Hoflehner ec6217c929 [InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence)
Fixes the self-host fail. Note that this commit activates dominator
analysis in the combiner by default (like the original commit did).

llvm-svn: 222590
2014-11-21 23:36:44 +00:00
Joerg Sonnenberger 02b13a8d9b Fix transformation of add with pc argument to adr for non-immediate
arguments.

llvm-svn: 222587
2014-11-21 22:39:34 +00:00
Kostya Serebryany 60ef25bd54 [asan] remove old experimental code
llvm-svn: 222586
2014-11-21 22:34:29 +00:00
Tom Stellard 91c7ef529d R600/SI: Add an s_mov_b32 to patterns which use the M0RegClass
We need to use a s_mov_b32 rather than a copy, so that CSE will
eliminate redundant moves to the m0 register.

llvm-svn: 222584
2014-11-21 22:31:46 +00:00
Tom Stellard a99ada528c R600/SI: Emit s_mov_b32 m0, -1 before every DS instruction
This s_mov_b32 will write to a virtual register from the M0Reg
class and all the ds instructions now take an extra M0Reg explicit
argument.

This change is necessary to prevent issues with the scheduler
mixing together instructions that expect different values in the m0
registers.

llvm-svn: 222583
2014-11-21 22:31:44 +00:00
Tom Stellard 6596ba7933 R600/SI: Add SIFoldOperands pass
This pass attempts to fold the source operands of mov and copy
instructions into their uses.

llvm-svn: 222581
2014-11-21 22:06:37 +00:00
Jozef Kolek 3b8ddb665b [mips][microMIPS] This patch implements functionality in MIPS delay slot
filler such as if delay slot filler have to put NOP instruction into the
delay slot of microMIPS BEQ or BNE instruction which uses the register $0,
then instead of emitting NOP this instruction is replaced by the corresponding
microMIPS compact branch instruction, i.e. BEQZC or BNEZC.

Differential Revision: http://reviews.llvm.org/D3566

llvm-svn: 222580
2014-11-21 22:04:35 +00:00
Tom Stellard fb0011cf1a R600/SI: Mark s_mov_b32 and s_mov_b64 as rematerializable
llvm-svn: 222579
2014-11-21 22:00:16 +00:00
Colin LeMahieu 310991c66f [Hexagon] Adding sxth instruction.
llvm-svn: 222577
2014-11-21 21:54:59 +00:00
Colin LeMahieu 91ffec908f [Hexagon] Adding sxtb instruction. Renaming some identically named classes that will be removed after converting referencing defs.
llvm-svn: 222575
2014-11-21 21:35:52 +00:00
Kostya Serebryany ea2cb6f616 [asan] add statistic counter to dynamic alloca instrumentation
llvm-svn: 222573
2014-11-21 21:25:18 +00:00
Colin LeMahieu e88447d8de [Hexagon] Removing SUB_rr and replacing with A2_sub.
llvm-svn: 222571
2014-11-21 21:19:18 +00:00
Tim Northover 6e8e506b3e Remove duplication of relocation names in lib/Object/ELFYAML.cpp
We can now use the ELF relocation .def files to create the mapping
of relocation numbers to names and avoid having to duplicate the
list of relocations.

Patch by Will Newton.

llvm-svn: 222567
2014-11-21 20:16:09 +00:00
Tim Northover 242785c031 Remove duplication of relocation names in lib/Object/ELF.cpp
We can now use the ELF relocation .def files to create the mapping
of relocation numbers to names and avoid having to duplicate the
list of relocations.

Patch by Will Newton.

llvm-svn: 222566
2014-11-21 20:16:07 +00:00
Manman Ren f0a582bada Debug Info: revert r222195, r222210 and r222239.
This is no longer needed after David's fix at r222377 + r222485.
rdar://18958417

llvm-svn: 222563
2014-11-21 19:55:23 +00:00
Roman Divacky d2b9a1b890 Disable header duplication at -Oz in loop-rotate pass.
llvm-svn: 222562
2014-11-21 19:53:24 +00:00
Manman Ren bfd2b829d9 Debug Info: add an assertion that the context field of a global variable can not
be a DIType with identifier.

This makes sure that there is no need to use DIScopeRef for global variable's
context.

rdar://18958417

llvm-svn: 222561
2014-11-21 19:47:48 +00:00
Manman Ren c98ec0e70a [Objective-C] Support a new special module flag that will be put into the
objc_imageinfo struct.

rdar://17954668

llvm-svn: 222558
2014-11-21 19:24:55 +00:00
Hans Wennborg cbb18e342f LazyValueInfo: range'ify some for-loops. No functional change.
llvm-svn: 222557
2014-11-21 19:07:46 +00:00
Rafael Espindola e973fd4a09 Add params() to FunctionType. NFC.
While at it, also use makeArrayRef in elements().

llvm-svn: 222556
2014-11-21 19:03:35 +00:00
Sanjay Patel eb4a4d5aeb Don't repeat class/function/variable names in comments. NFC.
llvm-svn: 222555
2014-11-21 18:58:38 +00:00
Hans Wennborg c5ec73d801 LazyValueInfo: fix some typos and indentation, etc. NFC.
llvm-svn: 222554
2014-11-21 18:58:23 +00:00
Rafael Espindola 334b73f470 Add and use a helper elements() to StructType. NFC.
llvm-svn: 222553
2014-11-21 18:53:05 +00:00
Matthias Braun 87a3ba6a6d Allow multiple -debug-only args
Debug output is shown if any of the -debug-only arguments match.

llvm-svn: 222547
2014-11-21 18:06:09 +00:00
Sanjay Patel b06441aded Less space; NFC
llvm-svn: 222546
2014-11-21 18:05:59 +00:00
Sanjay Patel 501890e909 Add a feature flag for slow 32-byte unaligned memory accesses [x86].
This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen
for Sandy Bridge and Ivy Bridge. There is no functionality change intended for 
those chips. Previously, the absence of AVX2 was being used as a proxy to detect
this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2
that do not have the 32-byte unaligned access slowdown.

Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ).

Differential Revision: http://reviews.llvm.org/D6355

llvm-svn: 222544
2014-11-21 17:40:04 +00:00
Duncan P. N. Exon Smith 44b82359c9 Revert "Allow FDE references outside the +/-2GB range supported by PC relative offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise."
This reverts commit r222538.

It's causing test failures for CFI, at least on Darwin:

http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental/1189/
http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_check/1391/

Note that the previous incremental build was on r222537, and the CFI
tests weren't failing:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental/1188/

llvm-svn: 222542
2014-11-21 17:21:18 +00:00
Chandler Carruth ce5a26b0e7 [x86] Restructure the checking patterns for v16 and v32 avx2 vector
shuffle lowering to allow much better blend matching.

Specifically, with the new structure the code seems clearer to me and we
correctly can hit the cases where merging two 128-bit lanes is a clear
win and can be shuffled cheaply afterward.

llvm-svn: 222539
2014-11-21 14:53:03 +00:00
Joerg Sonnenberger f769ae1ac4 Allow FDE references outside the +/-2GB range supported by PC relative
offsets for code models other than small/medium. For JIT application,
memory layout is less controlled and can result in truncations
otherwise. 

Patch from Akos Kiss.

Differential Revision: http://reviews.llvm.org/D6079

llvm-svn: 222538
2014-11-21 14:42:43 +00:00
Chandler Carruth 6c4d1ea8c4 [x86] Make the previous logic significantly less conservative and get
a bunch more improvements.

Non-lane-crossing is fine, the key is that lane merging only makes sense
for single-input shuffles. Not sure why I got so turned around here. The
code all works, I was just using the wrong model for it.

This only updates v4 and v8 lowering. The v16 and v32 lowering requires
restructuring the entire check sequence.

llvm-svn: 222537
2014-11-21 14:33:24 +00:00
Andrea Di Biagio 0225b5bf6f [DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero.
Before this patch, the DAGCombiner only tried to convert build_vector dag nodes
into shuffles if all operands were either extract_vector_elt or undef.

This patch improves that logic and teaches the DAGCombiner how to deal with
build_vector dag nodes where one or more operands are zero. A build_vector
dag node with some zero operands is turned into a shuffle only if the resulting
shuffle mask is legal for the target.

llvm-svn: 222536
2014-11-21 14:32:06 +00:00
Chandler Carruth d2b19bc867 [x86] Teach the x86 vector shuffle lowering to detect mergable 128-bit
lanes.

By special casing these we can often either reduce the total number of
shuffles significantly or reduce the number of (high latency on Haswell)
AVX2 shuffles that potentially cross 128-bit lanes. Even when these
don't actually cross lanes, they have much higher latency to support
that. Doing two of them and a blend is worse than doing a single insert
across the 128-bit lanes to blend and then doing a single interleaved
shuffle.

While this seems like a narrow case, it kept cropping up on me and the
difference is *huge* as you can see in many of the test cases. I first
hit this trying to perfectly fix the interleaving shuffle patterns used
by Halide for AVX2.

llvm-svn: 222533
2014-11-21 13:56:05 +00:00
Andrea Di Biagio 26e8f4d166 [DAG] Refactor the shuffle combining logic in DAGCombiner. NFC.
This patch simplifies the logic that combines a pair of shuffle nodes into
a single shuffle if there is a legal mask. Also added comments to better
describe the algorithm. No functional change intended.

llvm-svn: 222522
2014-11-21 11:33:07 +00:00
Alexey Volkov fd1731d876 [X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers
Differential Revision: http://reviews.llvm.org/D5938

llvm-svn: 222521
2014-11-21 11:19:34 +00:00
Yury Gribov 55441bb601 [asan] Add new hidden compile-time flag asan-instrument-allocas to sanitize variable-sized dynamic allocas. Patch by Max Ostapenko.
Reviewed at http://reviews.llvm.org/D6055

llvm-svn: 222519
2014-11-21 10:29:50 +00:00
NAKAMURA Takumi 7495dc76e8 Add LLVMScalarOpts to LLVMPowerPCCodeGen.
llvm-svn: 222516
2014-11-21 09:14:45 +00:00
Hao Liu 44e5d7a131 DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor info FMULs by the reciprocal.
E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip)

A hook is added to allow the target to control whether it needs to do such combine.

Reviewed in http://reviews.llvm.org/D6334

llvm-svn: 222510
2014-11-21 06:39:58 +00:00
Craig Topper 61e88f44f9 Remove a bunch of unnecessary typecasts to 'const TargetRegisterClass *'
llvm-svn: 222509
2014-11-21 05:58:21 +00:00
Hal Finkel f413be11f0 [PPC] Use SeparateConstOffsetFromGEP
This mirrors r222331, which enabled SeparateConstOffsetFromGEP on AArch64, in
the PowerPC backend. Yields, on a POWER7 machine, a 30% speedup on
SingleSource/Benchmarks/Shootout/nestedloop (this might just be from LICM,
there is a store moved out of the inner loop) and a potential speedup on
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode. Regardless, it
makes some code look cleaner, and synchronizing the backends in this regard
seems like a generally good thing.

llvm-svn: 222504
2014-11-21 04:35:51 +00:00
Richard Trieu e3d126cbbb Add accessor marcos to ConstantPlaceHolder, similar to those in the base class.
llvm-svn: 222502
2014-11-21 02:42:08 +00:00
David Majnemer 1f44142e4e This Reassociate change unintentionally slipped in r222499
llvm-svn: 222500
2014-11-21 02:37:38 +00:00
David Majnemer c0a313b57c SROA: The alloca type isn't a candidate promotion type for vectors
The alloca's type is irrelevant, only those types which are used in a
load or store of the exact size of the slice should be considered.

This manifested as an assertion failure when we compared the various
types: we had a size mismatch.

This fixes PR21480.

llvm-svn: 222499
2014-11-21 02:34:55 +00:00
Lang Hames b2dd9529eb [MCJIT] Remove JITEventListener::NotifyFreeingMachineCode. This method is dead
now that the old JIT has been removed.

llvm-svn: 222494
2014-11-21 01:57:09 +00:00
Zachary Turner 8325a5c6f8 Add curly braces to workaround an MSVC bug.
MSVC can't parse this pattern for range-based for loops.

llvm-svn: 222491
2014-11-21 01:19:09 +00:00
Quentin Colombet a7439d4483 [X86] Do not custom lower UINT_TO_FP when the target type does not
match the custom lowering.

<rdar://problem/19026326>

llvm-svn: 222489
2014-11-21 00:47:19 +00:00
Adrian Prantl 940257f7e4 Verifier: Check that all instructions have their parent pointers set up
correctly. This helps with catching problems caused by IRBuilder abuse
such as the one fixed in CFE r222487.

llvm-svn: 222488
2014-11-21 00:39:43 +00:00
Reid Kleckner 343c395f11 Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/
Follow up to r221940, where I must not have caught em all. NFC

llvm-svn: 222481
2014-11-20 23:51:47 +00:00
Reid Kleckner 357600eab5 Add out of line virtual destructors to all LLVMTargetMachine subclasses
These recently all grew a unique_ptr<TargetLoweringObjectFile> member in
r221878.  When anyone calls a virtual method of a class, clang-cl
requires all virtual methods to be semantically valid. This includes the
implicit virtual destructor, which triggers instantiation of the
unique_ptr destructor, which fails because the type being deleted is
incomplete.

This is just part of the ongoing saga of PR20337, which is affecting
Blink as well. Because the MSVC ABI doesn't have key functions, we end
up referencing the vtable and implicit destructor on any virtual call
through a class. We don't actually end up emitting the dtor, so it'd be
good if we could avoid this unneeded type completion work.

llvm-svn: 222480
2014-11-20 23:37:18 +00:00
Mehdi Amini fee89e43ad Update Makefile following directory removal in r222466
llvm-svn: 222475
2014-11-20 22:48:24 +00:00
Mehdi Amini ffd0100618 SimplifyCFG: Refactor GatherConstantCompares() result in a struct
Code seems cleaner and easier to understand this way

This is basically r222416, after fixes for MSVC lack of standard 
support, and a few cleaning (got rid of a warning).
Thanks Nakamura Takumi and Nico Weber for the MSVC fixes.

llvm-svn: 222472
2014-11-20 22:40:25 +00:00
Colin LeMahieu ff06261aed [Hexagon] [NFC] Merging InstPrinter directory in to MCTargetDesc since they have a circular dependency.
llvm-svn: 222458
2014-11-20 21:56:35 +00:00
Lang Hames a4be967eba [MCJIT] Remove JITEventListener::NotifyFunctionEmitted - this method is dead
now that the legacy JIT has been removed.

llvm-svn: 222453
2014-11-20 21:16:16 +00:00
Michael Zolotukhin 0dcae71449 Fix a trip-count overflow issue in LoopUnroll.
Currently LoopUnroll generates a prologue loop before the main loop
body to execute first N%UnrollFactor iterations. Also, this loop is
used if trip-count can overflow - it's determined by a runtime check.

However, we've been mistakenly optimizing this loop to a linear code for
UnrollFactor = 2, not taking into account that it also serves as a safe
version of the loop if its trip-count overflows.

llvm-svn: 222451
2014-11-20 20:19:55 +00:00
Saleem Abdulrasool 2f3b3f3182 X86: use the correct alloca symbol for Windows Itanium
Windows itanium targets the MSVCRT, and the stack probe symbol is provided by
MSVCRT.  This corrects the emission of stack probes on i686-windows-itanium.

llvm-svn: 222439
2014-11-20 18:01:26 +00:00
Frederic Riss 77a0743216 Make DWARFAcceleratorTable::dump() const.
As dump() methods  should be. To allow that, do not store the DWARFFormValue
objects used for the dump in the header data.

Per Alexey's suggestion!

llvm-svn: 222436
2014-11-20 16:21:11 +00:00
Frederic Riss 7c41c64db7 Add missing copyright headers.
llvm-svn: 222435
2014-11-20 16:21:06 +00:00
Frederic Riss e10ba6dd56 Do not create a replaceable Variables MDNode for function forward decls.
These fields would need to be explicitly deleted before we RAUW the temporary
node anyway (this was done in cfe commit r222373). Instead, do not create
these useless nodes in the first place.

llvm-svn: 222434
2014-11-20 15:52:34 +00:00
Timur Iskhodzhanov 71526a3eda Revert r222416, r222422, r222426: the former revision had problems and fixing them introduced bugs
llvm-svn: 222428
2014-11-20 12:36:43 +00:00
Timur Iskhodzhanov a0bffc0c11 Fix a typo
llvm-svn: 222426
2014-11-20 11:48:58 +00:00
NAKAMURA Takumi 5a83192570 SimplifyCFG.cpp: Tweak to let msc17 compliant.
- Use LLVM_DELETED_FUNCTION.
  - Don't use member initializers.
  - Don't use initializer list.

llvm-svn: 222422
2014-11-20 08:59:02 +00:00
Mehdi Amini 65253e76ed SimplifyCFG: Refactor GatherConstantCompares() result in a struct
Code seems cleaner and easier to understand this way

llvm-svn: 222416
2014-11-20 06:51:02 +00:00
Jyoti Allur 5b9f35220e [ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32
llvm-svn: 222414
2014-11-20 05:58:11 +00:00
Craig Topper e9891afa46 Fix a typo in a comment.
llvm-svn: 222412
2014-11-20 05:22:37 +00:00
Alexey Samsonov cfb97aa620 Remove support for undocumented SpecialCaseList entries.
"global-init", "global-init-src" and "global-init-type" were originally
used to blacklist entities in ASan init-order checker. However, they
were never documented, and later were replaced by "=init" category.

Old blacklist entries should be converted as follows:
  * global-init:foo -> global:foo=init
  * global-init-src:bar -> src:bar=init
  * global-init-type:baz -> type:baz=init

llvm-svn: 222401
2014-11-20 01:27:19 +00:00
Colin LeMahieu ac00643603 [Hexagon] Adding A2_xor instruction with IR selection pattern and test.
llvm-svn: 222399
2014-11-19 23:22:23 +00:00
Chad Rosier 90a2f9b110 Revert "[Reassociate] As the expression tree is rewritten make sure the operands are"
This reverts commit r222142.  This is causing/exposing an execution-time regression
in spec2006/gcc and coremark on AArch64/A57/Ofast.

Conflicts:

	test/Transforms/Reassociate/optional-flags.ll

llvm-svn: 222398
2014-11-19 23:21:20 +00:00
Colin LeMahieu 21866546ae [Hexagon] Adding A2_or instruction with IR selection pattern and test.
llvm-svn: 222396
2014-11-19 22:58:04 +00:00
Nico Weber 06839a536f Try to fix MSVS build after r222384. No intended behavior change.
llvm-svn: 222386
2014-11-19 21:16:11 +00:00
Mehdi Amini 9a25cb8806 SimplifyCFG: turn recursive GatherConstantCompares into iterative
A long sequence of || or && could lead to a stack explosion.

llvm-svn: 222384
2014-11-19 20:09:11 +00:00
Matthias Braun 745ea43687 RegisterCoalescer: Improve debug messages
- Show "Considering..." message after flipping so you actually see the final
  destination vreg as destination.
- Add a message on final join, so you can grep for "Success" messages to obtain
  a list of which register got merged with which.

llvm-svn: 222382
2014-11-19 19:46:17 +00:00
Matthias Braun d2f4c77800 Add a print and verify pass after the RegisterCoalescer
llvm-svn: 222381
2014-11-19 19:46:15 +00:00
Matthias Braun 47760d9667 MachineVerifier: Report register for bad liveranges
llvm-svn: 222380
2014-11-19 19:46:13 +00:00
Matthias Braun 9f87d75060 Introduce register dump helper
llvm-svn: 222379
2014-11-19 19:46:11 +00:00
David Majnemer 3563938ee4 AliasSet: Simplify mergeSetIn
No functional change intended.

llvm-svn: 222376
2014-11-19 19:36:18 +00:00
Andrea Di Biagio 1b657bfcc8 [X86] Improved lowering of v4x32 build_vector dag nodes.
This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes
that are known to have at least two non-zero elements.

With this patch, a build_vector that performs a blend with zero is 
converted into a shuffle. This is done to let the shuffle legalizer expand
the dag node in a optimal way. For example, if we know that a build_vector
performs a blend with zero, we can try to lower it as a movq/blend instead of
always selecting an insertps.

This patch also improves the logic that lowers a build_vector into a insertps
with zero masking. See for example the extra test cases added to test sse41.ll.

Differential Revision: http://reviews.llvm.org/D6311

llvm-svn: 222375
2014-11-19 19:34:29 +00:00
Lang Hames 56c0eb2d90 [ADT] Fix PR20728 - Incorrect APFloat::fusedMultiplyAdd results for x86_fp80.
As detailed at http://llvm.org/PR20728, due to an internal overflow in
APFloat::multiplySignificand the APFloat::fusedMultiplyAdd method can return
incorrect results for x87DoubleExtended (x86_fp80) values. This commonly
manifests as incorrect constant folding of libm fmal calls on x86. E.g.

fmal(1.0L, 1.0L, 3.0L) == 0.0L      (should be 4.0L)

This patch fixes PR20728 by adding an extra bit to the significand for
intermediate results of APFloat::multiplySignificand, avoiding the overflow.

llvm-svn: 222374
2014-11-19 19:15:41 +00:00
Tom Stellard e0ddfd11ea R600/SI: Make SIInstrInfo::isOperandLegal() more strict
A register operand that has a common sub-class with its instruction's
defined register class is not always legal.  For example,
SReg_32 and M0Reg both have a common sub-class, but we can't
use an SReg_32 in instructions that expect a M0Reg.

This prevents the llvm.SI.sendmsg.ll test from failing when the fold
operand pass is added.

llvm-svn: 222368
2014-11-19 16:58:49 +00:00
Zoran Jovanovic a4c4b5fc01 [mips][micromips] Implement SWM32 and LWM32 instructions
Differential Revision: http://reviews.llvm.org/D5519

llvm-svn: 222367
2014-11-19 16:44:02 +00:00
Suyog Sarda aba97f4aba Vectorize a reduction chain feeding into a 'return' statement.
e.x 
return (a[0]+b[0]) + (a[1]+b[1])

Differential Revision: http://reviews.llvm.org/D6227

llvm-svn: 222364
2014-11-19 16:07:38 +00:00
Jozef Kolek ffeed44190 [mips][microMIPS] Fix opcodes of MFHC1 and MTHC1 instructions.
Differential Revision: http://reviews.llvm.org/D6169

llvm-svn: 222355
2014-11-19 13:37:51 +00:00
Arnaud A. de Grandmaison 7b9dc28060 Fix tail recursion elimination
When the BasicBlock containing the return instrution has a PHI with 2
incoming values, FoldReturnIntoUncondBranch will remove the no longer
used incoming value and remove the no longer needed phi as well. This
leaves us with a BB that no longer has a PHI, but the subsequent call
to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not
remove the return instruction (which still uses the result of the call
instruction). This prevents EliminateRecursiveTailCall to remove
the value, as it is still being used in a basicblock which has no
predecessors.

The basicblock can not be erased on the spot, because its iterator is
still being used in runTRE.

This issue was exposed when removing the threshold on size for lifetime
marker insertion for named temporaries in clang. The testcase is a much
reduced version of peelOffOuterExpr(const Expr*, const ExplodedNode *)
from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp.

llvm-svn: 222354
2014-11-19 13:32:51 +00:00
Jozef Kolek 4d55b4d768 [mips][microMIPS] Implement CodeGen support for 16-bit instruction ADDIUR2.
Differential Revision: http://reviews.llvm.org/D5800

llvm-svn: 222352
2014-11-19 13:23:58 +00:00
Jozef Kolek 73f64eac8c [mips][microMIPS] Implement CodeGen support for ADDIUS5 instruction.
Differential Revision: http://reviews.llvm.org/D5799

llvm-svn: 222351
2014-11-19 13:11:09 +00:00
Jozef Kolek 5f95dd2b65 [mips][microMIPS] Implement LWXS instruction.
Differential Revision: http://reviews.llvm.org/D5407

llvm-svn: 222348
2014-11-19 11:39:12 +00:00
Jozef Kolek dc62fc4a8f [mips][microMIPS] Implement SDBBP and RDHWR instructions.
Differential Revision: http://reviews.llvm.org/D5240

llvm-svn: 222347
2014-11-19 11:25:50 +00:00
Simon Pilgrim 3ac3b251a9 [X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2
This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain.

I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr.

Differential Revision: http://reviews.llvm.org/D5699

llvm-svn: 222340
2014-11-19 10:06:49 +00:00
David Majnemer b7adf34ee0 AliasSetTracker: UnknownInsts should contribute to the refcount
AliasSetTracker::addUnknown may create an AliasSet devoid of pointers
just to contain an instruction if no suitable AliasSet already exists.
It will then AliasSet::addUnknownInst and we will be done.

However, it's possible for addUnknown to choose an existing AliasSet to
addUnknownInst.
If this were to occur, we are in a bit of a pickle: removing pointers
from the AliasSet can cause the entire AliasSet to become destroyed,
taking our unknown instructions out with them.

Instead, keep track whether or not our AliasSet has any unknown
instructions.

This fixes PR21582.

llvm-svn: 222338
2014-11-19 09:41:05 +00:00
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Hao Liu 2aa06a989d [AArch64] Disable useAA for Cortex-A57.
Using AA during CodeGen is very useful for in-order cores. It is less useful for ooo cores. Also I find
enabling useAA for Cortex-A57 may generate worse code for some test cases. If useAA in codegen is improved 
and benefical for ooo cores, we can enable it again.

llvm-svn: 222333
2014-11-19 06:48:56 +00:00
Hao Liu fd46bea46a [AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend.
SeparateConstOffsetFromGEP can gives more optimizaiton opportunities related to GEPs, which benefits EarlyCSE
and LICM. By enabling these passes we can have better address calculations and generate a better addressing
mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57.

Reviewed in http://reviews.llvm.org/D5864.

llvm-svn: 222331
2014-11-19 06:39:53 +00:00
Hao Liu 1d2a061bd8 [SeparateConstOffsetFromGEP] Allow SeparateConstOffsetFromGEP pass to lower GEPs.
If LowerGEP is enabled, it can lower a GEP with multiple indices into GEPs with a single index
or arithmetic operations. Lowering GEPs can always extract structure indices. Lowering GEPs can
also give use more optimization opportunities. It can benefit passes like CSE, LICM and CGP.

Reviewed in http://reviews.llvm.org/D5864

llvm-svn: 222328
2014-11-19 06:24:44 +00:00
David Blaikie 5106ce7897 Remove StringMap::GetOrCreateValue in favor of StringMap::insert
Having two ways to do this doesn't seem terribly helpful and
consistently using the insert version (which we already has) seems like
it'll make the code easier to understand to anyone working with standard
data structures. (I also updated many references to the Entry's
key and value to use first() and second instead of getKey{Data,Length,}
and get/setValue - for similar consistency)

Also removes the GetOrCreateValue functions so there's less surface area
to StringMap to fix/improve/change/accommodate move semantics, etc.

llvm-svn: 222319
2014-11-19 05:49:42 +00:00
Rui Ueyama 970dda295e llvm-readobj: fix off-by-one error in COFFDumper
It printed out base relocation table header as table entry.
This patch also makes llvm-readobj to not skip ABSOLUTE entries
becuase it was confusing.

llvm-svn: 222299
2014-11-19 02:07:10 +00:00
Weiming Zhao 7a2d15678e [Aarch64] Customer lowering of CTPOP to SIMD should check for NEON availability
llvm-svn: 222292
2014-11-19 00:29:14 +00:00
Kostya Serebryany cb45b126fb [asan] add experimental basic-block tracing to asan-coverage; also fix -fsanitize-coverage=3 which was broken by r221718
llvm-svn: 222290
2014-11-19 00:22:58 +00:00
Rui Ueyama 74e85130a0 llvm-readobj: teach it how to dump COFF base relocation table
llvm-svn: 222289
2014-11-19 00:18:07 +00:00
Kostya Serebryany e5ea424a77 Introduce llvm::SplitAllCriticalEdges
Summary:
move the code from BreakCriticalEdges::runOnFunction()
into a separate utility function llvm::SplitAllCriticalEdges()
so that it can be used independently.
No functionality change intended.

Test Plan: check-llvm

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6313

llvm-svn: 222288
2014-11-19 00:17:31 +00:00
Manman Ren c67109313c Revert r222039 because of bot failure.
http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/
Hopefully, bot will be green. If not, we will re-submit the commit.

llvm-svn: 222287
2014-11-19 00:13:26 +00:00
Matt Arsenault c09cc3c5b0 R600/SI: Implement areMemAccessesTriviallyDisjoint
This partially makes up for not having address spaces
used for alias analysis in some simple cases.

This is not yet enabled by default so shouldn't change anything yet.

llvm-svn: 222286
2014-11-19 00:01:31 +00:00
Matt Arsenault 9a072c19ae R600/SI: Set hasSideEffects = 0 on load and store instructions.
Assuming unmodeled side effects interferes with some scheduling
opportunities.

Don't put it in the base class of DS instructions since there
are a few weird effecting, non load/store instructions there.

llvm-svn: 222285
2014-11-18 23:57:33 +00:00
Simon Pilgrim 9c1e4123f8 [X86][AVX] 256-bit vector stack unaligned load/stores identification
Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills.

This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills.

This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll

Differential Revision: http://reviews.llvm.org/D6252

llvm-svn: 222281
2014-11-18 23:38:19 +00:00
Colin LeMahieu 44fd1c8bdf [Hexagon] Adding A2_and instruction.
llvm-svn: 222274
2014-11-18 22:45:47 +00:00
Chad Rosier c250881838 [FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmetic
shift-right for booleans (i1).

Arithmetic shift-right immediate with sign-/zero-extensions also works for
boolean values.  Update the assert and the test cases to reflect that fact.

llvm-svn: 222272
2014-11-18 22:41:49 +00:00
Chad Rosier e16d16ae41 [FastISel][AArch64] Also allow folding of sign-/zero-extend and logical
shift-right for booleans (i1).

Logical shift-right immediate with sign-/zero-extensions also works for boolean
values.  Update the assert and the test cases to reflect that fact.

llvm-svn: 222270
2014-11-18 22:38:42 +00:00
David Majnemer c6b8e20a5c InstCombine: Fix another infinite loop caused by visitFPTrunc
We would attempt to replace an frem's operand with the same operand.
This would cause InstCombine to think real work was done, causing
InstCombine to enter an infinite loop.

This fixes the second part of PR21576.

llvm-svn: 222265
2014-11-18 22:06:45 +00:00
Colin LeMahieu 38765e6d89 [Hexagon] Adding A2_sub instruction
Renaming test files.

llvm-svn: 222263
2014-11-18 21:51:51 +00:00
David Majnemer b32eaddf11 Revert "Revert r222040 because of bot failure."
This reverts commit r222203, reverting r222040 didn't end up turning the
bot green.

llvm-svn: 222261
2014-11-18 21:30:02 +00:00
Juergen Ributzka cdda930843 [FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts."
Shifts also perform sign-/zero-extends to larger types, which requires us to emit
an integer extend instead of a simple COPY.

Related to PR21594.

llvm-svn: 222257
2014-11-18 21:20:17 +00:00
Matt Arsenault 162c1010bd R600/SI: Move SIFixSGPRCopies to inst selector passes
This should expose more of the actually used VALU
instructions to the machine optimization passes.

This also should help getting i1 handling into a better state.
For not entirly understood reasons, this fixes the split-scalar-i64-add.ll
test where a 64-bit add would only partially be moved to the VALU
resulting in use of undefined VCC.

llvm-svn: 222256
2014-11-18 21:06:58 +00:00
Juergen Ributzka 7a7c4684e4 [AArch64] Don't optimize all compare instructions.
"optimizeCompareInstr" converts compares (cmp/cmn) into plain sub/add
instructions when the flags are not used anymore. This conversion is valid for
most instructions, but not all. Some instructions that don't set the flags
(e.g. sub with immediate) can set the SP, whereas the flag setting version uses
the same encoding for the "zero" register.

Update the code to also check for the return register before performing the
optimization to make sure that a cmp doesn't suddenly turn into a sub that sets
the stack pointer.

I don't have a test case for this, because it isn't easy to trigger.

llvm-svn: 222255
2014-11-18 21:02:40 +00:00
Owen Anderson b5a259935c Fix an incorrect chain operand when expanding INSERT_VECTOR operations through the stack.
Patch by Daniil Troshkov!

llvm-svn: 222254
2014-11-18 20:50:19 +00:00
Tom Stellard f0a2107c6b R600/SI: Make sure resource descriptors are always stored in SGPRs
llvm-svn: 222253
2014-11-18 20:39:39 +00:00
Colin LeMahieu efa74e0280 [Hexagon] Converting from ADD_rr to A2_add which has encoding bits.
Adding test to show correct instruction selection and encoding.

llvm-svn: 222249
2014-11-18 20:28:11 +00:00
Chad Rosier e53e8c8e58 [Reassociate] Rename local variable to not use same name as a member
variable. NFC.

llvm-svn: 222248
2014-11-18 20:21:54 +00:00
Juergen Ributzka 4328fd94b0 [FastISel][AArch64] Fix shift-immediate emission for "zero" shifts.
This change emits a COPY for a shift-immediate with a "zero" shift value.
This fixes PR21594 where we emitted a shift instruction with an incorrect
immediate operand.

llvm-svn: 222247
2014-11-18 19:58:59 +00:00
Jozef Kolek 52e84e99a1 Test commit to verify that commit access works.
llvm-svn: 222244
2014-11-18 19:20:34 +00:00
Philip Reames 018dbf18c4 Tweak EarlyCSE to recognize series of dead stores
EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!).

Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D6301

llvm-svn: 222241
2014-11-18 17:46:32 +00:00
David Majnemer 6fdb6b8fd4 InstCombine: Fold away tautological masked compares
It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true.

While this sort of reasoning should normally live in InstSimplify,
the machinery that derives this result is not trivial to split out.

llvm-svn: 222230
2014-11-18 09:31:41 +00:00
David Majnemer 1a3327bb62 InstCombine: Clean up foldLogOpOfMaskedICmps
No functional change intended.

llvm-svn: 222229
2014-11-18 09:31:36 +00:00
Frederic Riss fdccfc1e19 Allow DwarfCompileUnit::constructImportedEntityDIE to instanciate a GlobalVariable DIE.
Usually global variables are in a retain list and instanciated before
any call to constructImportedEntityDIE is made. This isn't true for
forward declarations though.
The testcase for this change is generated by a clang patched to emit
such forward declarations (patch at http://reviews.llvm.org/D6173
which will land soon). The updated testcase tests more than just
global variables, it now tests every type of 'using' clause we
support.

llvm-svn: 222217
2014-11-18 02:46:11 +00:00
Hans Wennborg a6a11a969a SimplifyCFG: Range'ify some for-loops. No functional change.
llvm-svn: 222215
2014-11-18 02:37:11 +00:00
David Majnemer 9a91e4a18a IndVarSimplify: Allow LFTR to fire more often
I added a pessimization in r217102 to prevent miscompiles when the
incremented induction variable was used in a comparison; it would be
poison.

Try to use the incremented induction variable more often when we can be
sure that the increment won't end in poison.

Differential Revision: http://reviews.llvm.org/D6222

llvm-svn: 222213
2014-11-18 02:20:58 +00:00
Duncan P. N. Exon Smith 4db24cc49b IR: Sink MDNode::Hash down to GenericMDNode::Hash
Part of PR21532.

llvm-svn: 222212
2014-11-18 02:20:29 +00:00
Duncan P. N. Exon Smith c23610b1e4 IR: Move MDNode operands from the back to the front
Having the operands at the back prevents subclasses from safely adding
fields.  Move them to the front.

Instead of replicating the custom `malloc()`, `free()` and `DestroyFlag`
logic that was there before, overload `new` and `delete`.

I added calls to a new `GenericMDNode::dropAllReferences()` in
`LLVMContextImpl::~LLVMContextImpl()`.  There's a maze of callbacks
happening during teardown, and this resolves them before we enter
the destructors.

Part of PR21532.

llvm-svn: 222211
2014-11-18 01:56:14 +00:00
Michael J. Spencer 21245af8e7 Fix covered switch warning
llvm-svn: 222209
2014-11-18 01:26:46 +00:00
Michael J. Spencer bbd875b6ad Support ELF files of unknown type.
llvm-svn: 222208
2014-11-18 01:14:25 +00:00
Duncan P. N. Exon Smith 50846f80ac IR: Split MDNode into GenericMDNode and MDNodeFwdDecl
Split `MDNode` into two classes:

  - `GenericMDNode`, which is uniquable (and for now, always starts
    uniqued).  Once `Metadata` is split from the `Value` hierarchy, this
    class will lose the ability to RAUW itself.

  - `MDNodeFwdDecl`, which is used for the "temporary" interface, is
    never uniqued, and isn't managed by `LLVMContext` at all.

I've left most of the guts in `MDNode` for now, but I'll incrementally
move things to the right places (or delete the functionality, as
appropriate).

Part of PR21532.

llvm-svn: 222205
2014-11-18 00:37:17 +00:00
Manman Ren a64bd44fd8 Revert r222040 because of bot failure.
http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/
Hopefully, bot will be green.

llvm-svn: 222203
2014-11-18 00:33:22 +00:00
Manman Ren 554865da5b Debug Info: In DIBuilder, the context field of a global variable is updated to
use DIScopeRef.

A paired commit at clang will follow to show cases where we will use an
identifer for the context of a global variable.

rdar://18958417

llvm-svn: 222195
2014-11-18 00:29:08 +00:00
Duncan P. N. Exon Smith f39c3b8108 IR: Simplify uniquing for MDNode
Change uniquing from a `FoldingSet` to a `DenseSet` with custom
`DenseMapInfo`.  Unfortunately, this doesn't save any memory, since
`DenseSet<T>` is a simple wrapper for `DenseMap<T, char>`, but I'll come
back to fix that later.

I used the name `GenericDenseMapInfo` to the custom `DenseMapInfo` since
I'll be splitting `MDNode` into two classes soon: `MDNodeFwdDecl` for
temporaries, and `GenericMDNode` for everything else.

I also added a non-debug-info reduced version of a type-uniquing test
that started failing on an earlier draft of this patch.

Part of PR21532.

llvm-svn: 222191
2014-11-17 23:28:21 +00:00
Reid Kleckner d970702ab3 Revert "ADT: correctly report isMSVCEnvironment for windows itanium"
This reverts commit r222180.

llvm-svn: 222188
2014-11-17 22:55:59 +00:00
Saleem Abdulrasool 76f2c77070 ADT: correctly report isMSVCEnvironment for windows itanium
The itanium environment on Windows uses MSVC and is a MSVC environment.  Report
this correctly.

llvm-svn: 222180
2014-11-17 22:13:26 +00:00
Matt Arsenault 7480a0e163 R600/SI: Don't copy flags when extracting subreg
This was resulting in use of a register after a kill.
For some reason this showed up as a problem in many tests
when moving the SIFixSGPRCopies pass closer to instruction
selection.

llvm-svn: 222175
2014-11-17 21:11:37 +00:00
Matt Arsenault 6f679785f4 R600/SI: Assume SIFixSGPRCopies makes changes
I'm not sure if this was breaking anything.

llvm-svn: 222174
2014-11-17 21:11:34 +00:00
Rafael Espindola 5cb9c82a5d Factor common code it Linker::init.
The TypeFinder was not being used in one of the constructors.

llvm-svn: 222172
2014-11-17 20:51:01 +00:00
Rafael Espindola 49e9bf8c74 Pass a reference to ValueEnumerator.
NFC. This will just make it easier to use std::unique_ptr in a caller.

llvm-svn: 222170
2014-11-17 20:06:27 +00:00
Juergen Ributzka c9591e9bdb [SimplifyCFG] Make the value type of the hole check bitmask a power-of-2.
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.

The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.

To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.

This fixes rdar://problem/18984639.

llvm-svn: 222168
2014-11-17 19:39:56 +00:00
Chad Rosier bc0b869be9 [Reassociate] As the expression tree is rewritten make sure the operands are
emitted in canonical form.

llvm-svn: 222142
2014-11-17 16:33:50 +00:00
Alexey Volkov 7de210bd52 [X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs
Differential Revision: http://reviews.llvm.org/D5934

llvm-svn: 222141
2014-11-17 16:17:51 +00:00
Chad Rosier 9a1ac6e494 [Reassociate] Canonicalize constants to RHS operand.
Fix a thinko where the RHS was already a constant.

llvm-svn: 222139
2014-11-17 15:52:51 +00:00
Renato Golin 609bf92365 Fix ARM triple parsing
The triple parser should only accept existing architecture names
when the triple starts with armv, armebv, thumbv or thumbebv.

Patch by Gabor Ballabas.

llvm-svn: 222129
2014-11-17 14:08:57 +00:00
David Majnemer 5d2670c52a ScalarEvolution: Construct SCEVDivision's Derived type instead of itself
SCEVDivision::divide constructed an object of SCEVDivision<Derived>
instead of Derived.  divide would call visit which would cast the
SCEVDivision<Derived> to type Derived.  As it happens,
SCEVDivision<Derived> and Derived currently have the same layout but
this is fragile and grounds for UB.

Instead, just construct Derived.  No functional change intended.

llvm-svn: 222126
2014-11-17 11:27:45 +00:00
Oliver Stannard 970b0d576c [Thumb1] Re-write emitThumbRegPlusImmediate
This was motivated by a bug which caused code like this to be
miscompiled:
  declare void @take_ptr(i8*)
  define void @test() {
    %addr1.32 = alloca i8
    %addr2.32 = alloca i32, i32 1028
    call void @take_ptr(i8* %addr1)
    ret void
  }

This was emitting the following assembly to get the value of %addr1:
  add r0, sp, #1020
  add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
  add r0, sp, #1020
  add r0, sp, #8

This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.

llvm-svn: 222125
2014-11-17 11:18:10 +00:00
David Majnemer 236b0ca790 Object, COFF: Tighten the object file parser
We were a little lax in a few areas:
- We pretended that import libraries were like any old COFF file, they
  are not.  In fact, they aren't really COFF files at all, we should
  probably grow some specialized functionality to handle them smarter.
- Our symbol iterators were more than happy to attempt to go past the
  end of the symbol table if you had a symbol with a bad list of
  auxiliary symbols.

llvm-svn: 222124
2014-11-17 11:17:17 +00:00
Oliver Stannard d29db9b949 Fix optimisations of SELECT_CC which assumed result is boolean
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.

This is a follow-up to D6210/r221693.

llvm-svn: 222123
2014-11-17 10:49:31 +00:00
Yaron Keren 428ceaf90a silence gcc 4.9.1 warning in /llvm/lib/Support/Windows/Path.inc:564:39:
warning: suggest parentheses around assignment used as truth value [-Wparentheses]
   if (ec = widenPath(path, path_utf16))

llvm-svn: 222122
2014-11-17 09:29:33 +00:00
Erik Eckstein 105374fe5e Optimize switch lookup tables with linear mapping.
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:

int f1(int x) {
  switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
  }
  return 0;
}

generates:

define i32 @f1(i32 %x) #0 {
entry:
  %0 = icmp ult i32 %x, 4
  br i1 %0, label %switch.lookup, label %return

switch.lookup:
  %switch.offset = add i32 %x, 10
  ret i32 %switch.offset

return:
  ret i32 0
}

llvm-svn: 222121
2014-11-17 09:13:57 +00:00
Craig Topper f98c606479 Add missing semicolon from r222118.
llvm-svn: 222119
2014-11-17 05:58:26 +00:00
Craig Topper cf0444ba2a Move register class name strings to a single array in MCRegisterInfo to reduce static table size and number of relocation entries.
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.

llvm-svn: 222118
2014-11-17 05:50:14 +00:00
Rafael Espindola a3b5b60753 Add back r222061 with a fix.
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.

Original message:

Don't make assumptions about the name of private global variables.

Private variables are can be renamed, so it is not reliable to make
decisions on the name.

The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).

This patch changes one case where we were looking at the variable name to use
the section instead.

Test tuning by Michael Gottesman.

llvm-svn: 222117
2014-11-17 02:28:27 +00:00
Craig Topper 6438fc3d05 Replace a couple asserts with static_asserts.
llvm-svn: 222114
2014-11-17 00:26:50 +00:00
Craig Topper 7f416c8acb Convert some EVTs to MVTs where only a SimpleValueType is needed.
llvm-svn: 222109
2014-11-16 21:17:18 +00:00
David Majnemer 32b8ccf480 ScalarEvolution: Introduce SCEVSDivision and SCEVUDivision
It turns out that not all users of SCEVDivision want the same
signedness.  Let the users determine which operation they'd like by
explicitly choosing SCEVUDivision or SCEVSDivision.

findArrayDimensions and computeAccessFunctions will use SCEVSDivision
while HowFarToZero will use SCEVUDivision.

llvm-svn: 222104
2014-11-16 20:35:19 +00:00
Jingyue Wu 0fa125a77d [DependenceAnalysis] Allow subscripts of different types
Summary:
Several places in DependenceAnalysis assumes both SCEVs in a subscript pair
share the same integer type. For instance, isKnownPredicate calls
SE->getMinusSCEV(X, Y) which asserts X and Y share the same type. However,
DependenceAnalysis fails to ensure this assumption when producing a subscript
pair, causing tests such as NonCanonicalizedSubscript to crash. With this
patch, DependenceAnalysis runs unifySubscriptType before producing any
subscript pair, ensuring the assumption.

Test Plan:
Added NonCanonicalizedSubscript.ll on which DependenceAnalysis before the fix
crashed because subscripts have different types.

Reviewers: spop, sebpop, jingyue

Reviewed By: jingyue

Subscribers: eliben, meheff, llvm-commits

Differential Revision: http://reviews.llvm.org/D6289

llvm-svn: 222100
2014-11-16 16:52:44 +00:00
Craig Topper 949d50bc71 [x86] Remove two redundant isel patterns. They equivalent already exists in the instruction pattern.
llvm-svn: 222094
2014-11-16 09:24:16 +00:00
David Majnemer 0df1d12476 ScalarEvolution: HowFarToZero was wrongly using signed division
HowFarToZero was supposed to use unsigned division in order to calculate
the backedge taken count.  However, SCEVDivision::divide performs signed
division.  Unless I am mistaken, no users of SCEVDivision actually want
signed arithmetic: switch to udiv and urem.

This fixes PR21578.

llvm-svn: 222093
2014-11-16 07:30:35 +00:00
David Majnemer 5854e9fae8 InstSimplify: Optimize ICmpInst xform that uses computeKnownBits
A few things:
- computeKnownBits is relatively expensive, let's delay its use as long
  as we can.
- Don't create two APInt values just to run computeKnownBits on a
  ConstantInt, we already know the exact value!
- Avoid creating a temporary APInt value in order to calculate unary
  negation.

llvm-svn: 222092
2014-11-16 02:20:08 +00:00
Andrea Di Biagio e13a0b81f4 [DAG] Improved target independent vector shuffle folding logic.
This patch teaches the DAGCombiner how to combine shuffles according to rules:
   shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2)
   shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2)
   shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2)

llvm-svn: 222090
2014-11-15 22:56:25 +00:00
Simon Pilgrim 6d675f4e35 [X86][SSE] Improve legal SHUFP and PSHUFD shuffle matching
Updated X86TargetLowering::isShuffleMaskLegal to match SHUFP masks with commuted inputs and PSHUFD masks that reference the second input.

As part of this I've refactored isPSHUFDMask to work in a more general manner and allow it to match against either the first or second input vector.

Differential Revision: http://reviews.llvm.org/D6287

llvm-svn: 222087
2014-11-15 21:13:05 +00:00
Matt Arsenault 36094d788a R600: Permute operands when selecting legacy min/max
This gets the correct NaN behavior based on the compare type
the hardware uses. This now passes the new piglit test I have
for this on SI.

Add stricter tests for the operand order.

llvm-svn: 222079
2014-11-15 05:02:57 +00:00
Reid Kleckner 007239863e Revert "Don't make assumptions about the name of private global variables."
This reverts commit r222061.

It's causing linker errors.

llvm-svn: 222077
2014-11-15 02:03:53 +00:00
Tom Stellard 83171b32ed R600: Fix 64-bit integer division
This fixes a failure in one of the oclconform tests.

Patch by: Jan Vesely

llvm-svn: 222073
2014-11-15 01:07:57 +00:00
Tom Stellard bf69d76106 R600: Factor i64 UDIVREM lowering into its own fuction
This is so it could potentially be used by SI.  However, the current
implementation does not always produce correct results, so the
IntegerDivisionPass is being used instead.

llvm-svn: 222072
2014-11-15 01:07:53 +00:00
Duncan P. N. Exon Smith dbf64acd29 DIBuilder: Use Constant instead of Value
Make explicit the requirement that most IR values in `DIBuilder` are
`Constant`.  This requires a follow-up change in clang.

Part of PR21532.

llvm-svn: 222070
2014-11-15 00:23:49 +00:00
Duncan P. N. Exon Smith 774951fc2e DIBuilder: Change private helper function to static, NFC
llvm-svn: 222068
2014-11-15 00:05:04 +00:00
Duncan P. N. Exon Smith c81307af0f DI: Use Metadata for DITypeRef and DIScopeRef
Now that `MDString` and `MDNode` have a common base class, use it.  Note
that it's not useful to assume subclasses of `Metadata` must be one or
the other since we'll be adding more subclasses soon enough.

Part of PR21532.

llvm-svn: 222064
2014-11-14 23:55:03 +00:00
Reid Kleckner c2291f3905 Rename EH related stuff to be more precise
Summary:
The current "WinEH" exception handling type is more about Itanium-style
LSDA tables layered on top of the Windows native unwind info format
instead of .eh_frame tables or EHABI unwind info. Use the name
"ItaniumWinEH" to better reflect the hybrid nature of the design.

Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions,
since the LSDA is part of the Itanium C++ ABI document, and not the
DWARF standard.

Reviewers: echristo

Subscribers: llvm-commits, compnerd

Differential Revision: http://reviews.llvm.org/D6279

llvm-svn: 222062
2014-11-14 23:31:07 +00:00
Rafael Espindola 2fc723099f Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.

The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).

This patch changes one case where we were looking at the variable name to use
the section instead.

Test tuning by Michael Gottesman.

llvm-svn: 222061
2014-11-14 23:17:47 +00:00
Tim Northover 603d316517 ARM: refactor .cfi_def_cfa_offset emission.
We use to track quite a few "adjusted" offsets through the FrameLowering code
to account for changes in the prologue instructions as we went and allow the
emission of correct CFA annotations. However, we were missing a couple of cases
and the code was almost impenetrable.

It's easier to just add any stack-adjusting instruction to a list and emit them
together.

llvm-svn: 222057
2014-11-14 22:45:33 +00:00
Tim Northover 9d2d218f49 ARM: correctly calculate the offset of FP in its push.
When we folded the DPR alignment gap into a push, we weren't noting the extra
distance from the beginning of the push to the FP, and so FP ended up pointing
at an incorrect offset.

The .cfi_def_cfa_offset directives are still wrong in this case, but I think
that can be improved by refactoring.

llvm-svn: 222056
2014-11-14 22:45:31 +00:00
David Majnemer 8c3d92e7e5 InstCombine: Fix infinite loop caused by visitFPTrunc
We would attempt to replace a fptrunc of an frem with an identical
fptrunc.  This would cause the new fptrunc to be added to the worklist.
Of course, this results in an infinite loop because we will keep
visiting the newly created fptruncs.

This fixes PR21576.

llvm-svn: 222040
2014-11-14 21:21:15 +00:00
Chad Rosier 1ff4c0bf0b Reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"

This commit updates the failing test in
Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll

The failing test is sensitive to the order in which we process loads.  This
version turns on the RPO traversal instead of the while DT traversal in GVN.
The new test code is functionally same just the order of loads that are
eliminated is swapped.

This new version also fixes an issue where GVN splits a critical edge and
potentially invalidate the RPO/DT iterator.

llvm-svn: 222039
2014-11-14 21:09:13 +00:00
Tom Stellard e63d5ed2f9 R600/SI: Mark s_movk_i32 as rematerializable
llvm-svn: 222037
2014-11-14 20:43:28 +00:00
Tom Stellard bdd567d86d R600/SI: Fix spilling of m0 register
If we have spilled the value of the m0 register, then we need to restore
it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't
write to m0.

v_readlane_b32 can't write to m0, so

llvm-svn: 222036
2014-11-14 20:43:26 +00:00
Frederic Riss 3f1a0a7ce2 COFF: Add support for Dwarf accelerator tables.
This allows COFF targets to emit accelerator tables
when requested by -dwarf-accel-tables=Enable instead
of aborting. The test DebugInfo/cross-cu-inlining.ll
covers this on COFF platforms.

llvm-svn: 222034
2014-11-14 20:33:40 +00:00
Matt Arsenault cc3c2b3946 R600/SI: Combine min3/max3 instructions
llvm-svn: 222032
2014-11-14 20:08:52 +00:00
Frederic Riss 7c50047684 [dwarfdump] Handle relocations in Dwarf accelerator tables
ELF targets (and maybe COFF) use relocations when referring
to strings in the .debug_str section. Handle that in the
accelerator table dumper. This commit restores the
test/DebugInfo/cross-cu-inlining.ll test to its expected
platform independant form, validating that the fix works
(this test failed on linux boxes).

llvm-svn: 222029
2014-11-14 19:30:08 +00:00
David Blaikie 711cd9c53c Remove redundant virtual on overriden functions.
llvm-svn: 222023
2014-11-14 19:06:36 +00:00
Duncan P. N. Exon Smith 224e8c0943 IR: Make MDString inherit from Metadata
llvm-svn: 222022
2014-11-14 18:45:40 +00:00
Matt Arsenault 72858935f7 R600/SI: Fix verifier error from a branch on IMPLICIT_DEF
SIILowerI1Copies wasn't correctly handling this case.

llvm-svn: 222020
2014-11-14 18:43:41 +00:00
Duncan P. N. Exon Smith a69934fdc4 IR: Take an LLVMContext in Metadata::Metadata()
llvm-svn: 222019
2014-11-14 18:42:09 +00:00
Duncan P. N. Exon Smith 46d91ad4b6 Add a blank line, NFC
llvm-svn: 222018
2014-11-14 18:42:06 +00:00
Matt Arsenault 6ad34266e3 Fix unused variable warning without asserts
llvm-svn: 222017
2014-11-14 18:40:49 +00:00
Matt Arsenault d28a7fde32 R600/SI: Match integer min / max instructions
llvm-svn: 222015
2014-11-14 18:30:06 +00:00
Matt Arsenault 94812216ef R600/SI: Use S_BFE_I64 for 64-bit sext_inreg
llvm-svn: 222012
2014-11-14 18:18:16 +00:00
Chad Rosier df8f2a23cb [Reassociate] Canonicalize the operands of all binary operators.
llvm-svn: 222008
2014-11-14 17:09:19 +00:00
Chad Rosier d99df68e19 [Reassociate] Canonicalize operands of vector binary operators.
Prior to this commit fmul and fadd binary operators were being canonicalized for
both scalar and vector versions.  We now canonicalize add, mul, and, or, and xor
vector instructions.

llvm-svn: 222006
2014-11-14 17:08:15 +00:00
Chad Rosier f8b55f1bc5 [Reassociate] Canonicalize constants to RHS operand.
llvm-svn: 222005
2014-11-14 17:05:59 +00:00
Frederic Riss e837ec29c3 Reapply "[dwarfdump] Add support for dumping accelerator tables."
This reverts commit r221842 which was a revert of r221836 and of the
test parts of r221837.

This new version fixes an UB bug pointed out by David (along with
addressing some other review comments), makes some dumping more
resilient to broken input data and forces the accelerator tables
to be dumped in the tests where we use them (this decision is
platform specific otherwise).

llvm-svn: 222003
2014-11-14 16:15:53 +00:00
Cameron McInally 04400449c5 [AVX512] Add 512b masked integer shift by immediate patterns.
llvm-svn: 222002
2014-11-14 15:43:00 +00:00
Chad Rosier f59e548ba7 [Reassociate] Improve rank debug information. NFC.
llvm-svn: 221999
2014-11-14 15:01:38 +00:00
Tom Stellard 9dec074399 R600/SI: Fix assembly names for exec_hi and exec_lo
llvm-svn: 221995
2014-11-14 14:08:04 +00:00
Tom Stellard 9d7ddd516e R600/SI: Start implementing an assembler
This was done using the Sparc and PowerPC AsmParsers as guides.  So far it
is very simple and only supports sopp instructions.

llvm-svn: 221994
2014-11-14 14:08:00 +00:00
Bill Schmidt 7674692961 [PowerPC] Add VSX builtins for vec_div
This patch adds builtin support for xvdivdp and xvdivsp, along with a
test case.  Straightforward stuff.

There's a companion patch for Clang.

llvm-svn: 221983
2014-11-14 12:10:40 +00:00
David Majnemer f69b0585c1 obj2yaml, yaml2obj: Add support for COFF executables
In support of serializing executables, obj2yaml now records the virtual address
and size of sections.  It also serializes whatever we strictly need from
the PE header, it expects that it can reconstitute everything else via
inference.

yaml2obj can reconstitute a fully linked executable.

In order to get executables correctly serialized/deserialized, other
bugs were fixed as a circumstance.  We now properly respect file and
section alignments.  We also avoid writing out string tables unless they
are strictly necessary.

llvm-svn: 221975
2014-11-14 08:15:42 +00:00
NAKAMURA Takumi 548d7f614f SearchForAddressOfSymbol(): Disable 3 symbols, copysignf, fminf, and fmaxf, on msc17. *These were added in VS 2013*
llvm-svn: 221971
2014-11-14 04:53:55 +00:00
Matt Arsenault 21c938e14b R600/SI: Make constant array static
llvm-svn: 221965
2014-11-14 02:21:58 +00:00
Justin Bogner d5fca92425 llvm-cov: Sink some reporting logic into CoverageMapping
This teaches CoverageMapping::getCoveredFunctions to filter to a
particular file and uses that to replace most of the logic found in
llvm-cov report.

llvm-svn: 221962
2014-11-14 01:50:32 +00:00
Tim Northover d3be12a6c7 X86: use getConstant rather than getTargetConstant behind BUILD_VECTOR.
getTargetConstant should only be used when you can guarantee the instruction
selected will be able to cope with the raw value. BUILD_VECTOR is rather too
generic for this so we should use getConstant instead. In that case, an
instruction can still consume the constant, but if it doesn't it'll be
materialised through its own round of ISel.

Should fix PR21352.

llvm-svn: 221961
2014-11-14 01:30:14 +00:00
Duncan P. N. Exon Smith f17e740157 IR: Rewrite uniquing and creation of MDString
Stop using `Value::getName()` to get the string behind an `MDString`.
Switch to `StringMapEntry<MDString>` so that we can find the string by
its coallocation.

This is part of PR21532.

llvm-svn: 221960
2014-11-14 01:17:09 +00:00
David Blaikie a92765ca32 Fix 80 cols caught by the linter...
We have a linter running in our build now?

llvm-svn: 221957
2014-11-14 00:41:42 +00:00
Reid Kleckner d378174d54 Fix build of Mips code with MSVC by using our macro instead of __attribute__((unused)) directly
llvm-svn: 221956
2014-11-14 00:39:33 +00:00
Reid Kleckner 283bc2ed28 Allow the use of functions as typeinfo in landingpad clauses
This is one step towards supporting SEH filter functions in LLVM.

llvm-svn: 221954
2014-11-14 00:35:50 +00:00
Duncan P. N. Exon Smith 77349e7aaf IR: Make MDString::getName() private
Hide the fact that `MDString`'s string is stored in `Value::Name` --
that's going to change soon.  Update the only in-tree client that was
using it instead of `Value::getString()`.

Part of PR21532.

llvm-svn: 221951
2014-11-13 23:59:16 +00:00
Reid Kleckner ffafda277d Fix the VS 2012 build
VS 2012 doesn't have fminf or fmaxf.

llvm-svn: 221949
2014-11-13 23:45:50 +00:00
Reed Kotler d5c4196cb6 First stage of call lowering for Mips fast-isel
Summary:
This has most of what is needed for mips fast-isel call lowering for O32.
What is missing I will add on the next patch because this patch is already too large.
It should not be doing anything wrong but it will punt on some cases that it is basically
capable of doing.

The mechanism is there for parameters to be passed on the stack but I have not enabled it because it serves as a way for now to prevent some of the strange cases of O32 register passing that I have not fully checked yet and have some issues.

The Mips O32 abi rules are very complicated as far how data is passed in floating and integer registers.

However there is a way to think about this all very simply and this implementation reflects that.

Basically, the ABI rules are written as if everything is passed on the stack and aligned as such.
Once that is conceptually done, it is nearly trivial to reassign those locations to registers and
then all the complexity disappears.

So I have told tablegen that all the data is passed on the stack and during the lowering I fix
this by assigning to registers as per the ABI doc.

This has been my approach and you can line up what I did with the ABI document and see 1 to 1 what
is going on.



Test Plan: callabi.ll

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: jholewinski, echristo, ahatanak, llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D5714

llvm-svn: 221948
2014-11-13 23:37:45 +00:00
Reid Kleckner 9aeb04793a Fix symbol resolution of floating point libc builtins in MCJIT
Fix for LLI failure on Windows\X86: http://llvm.org/PR5053

LLI.exe crashes on Windows\X86 when single precession floating point
intrinsics like the following are used: acos, asin, atan, atan2, ceil,
copysign, cos, cosh, exp, floor, fmin, fmax, fmod, log, pow, sin, sinh,
sqrt, tan, tanh

The above intrinsics are defined as inline-expansions in math.h, and are
not exported by msvcr120.dll (Win32 API GetProcAddress returns null).

For an FREM instruction, the JIT compiler generates a call to a stub for
the fmodf() intrinsic, and adds a relocation to fixup at load time. The
loader searches the libraries for the function, but fails because the
symbol is not exported. So, the call target remains NULL and the
execution crashes.

Since the math functions are loaded at JIT/runtime, the JIT can patch
CALL instruction directly instead of the searching the libraries'
exported symbols.  However, this fix caused build failures due to
unresolved symbols like _fmodf at link time.

Therefore, the current fix defines helper functions in the Runtime
link/load library to perform the above operations.  The address of these
helper functions are used to patch up the CALL instruction at load time.

Reviewers: lhames, rnk

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D5387

Patch by Swaroop Sridhar!

llvm-svn: 221947
2014-11-13 23:32:52 +00:00
Reid Kleckner 9651f01a88 Silence MSVC warning on missing return after fully covered switch
llvm-svn: 221943
2014-11-13 23:07:22 +00:00
Matt Arsenault da59f3de45 R600/SI: Fix fmin_legacy / fmax_legacy matching for SI
select_cc is expanded on SI, so this was never matched.

llvm-svn: 221941
2014-11-13 23:03:09 +00:00
Reid Kleckner 971c3ea67b Use nullptr instead of NULL for variadic sentinels
Windows defines NULL to 0, which when used as an argument to a variadic
function, is not a null pointer constant. As a result, Clang's
-Wsentinel fires on this code. Using '0' would be wrong on most 64-bit
platforms, but both MSVC and Clang make it work on Windows. Sidestep the
issue with nullptr.

llvm-svn: 221940
2014-11-13 22:55:19 +00:00
Chad Rosier 8716b58583 Revert "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE."
This reverts commit r221924.  It appears the commit was a bit premature and is causing
bot failures that need further investigation.

llvm-svn: 221939
2014-11-13 22:54:59 +00:00
Rafael Espindola f895045546 Move calls to push_back out of readAbbreviated(Literal|Field).
These functions always return a single value and not all callers want to
push them into a SmallVector.

llvm-svn: 221934
2014-11-13 22:29:02 +00:00
Reid Kleckner 4a78699c8c Avoid usage of char16_t as MSVC "14" doesn't appear to support it
Fixes the MSVC "14" build.

llvm-svn: 221932
2014-11-13 22:09:56 +00:00
Rafael Espindola 76d41f8b32 Make a few helper functions static. NFC.
llvm-svn: 221930
2014-11-13 21:54:59 +00:00
Aditya Nandakumar 3053155652 We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed.
llvm-svn: 221926
2014-11-13 21:29:21 +00:00
Chad Rosier dd526665fc [GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE.
Phabricator Revision: http://reviews.llvm.org/D6103
Patch by "Balaram Makam" <bmakam@codeaurora.org>!

llvm-svn: 221924
2014-11-13 21:17:58 +00:00
Juergen Ributzka 0af310d052 [FastISel][AArch64] Don't bail during simple GEP instruction selection.
The generic FastISel code would bail, because it can't emit a sign-extend for
AArch64. This copies the code over and uses AArch64 specific emit functions.

This is not ideal and 'computeAddress' should handles this, so it can fold the
address computation into the memory operation.

I plan to clean up 'computeAddress' anyways, so I will add that in a future
commit.

Related to rdar://problem/18962471.

llvm-svn: 221923
2014-11-13 20:50:44 +00:00
Matt Arsenault 7784992999 R600/SI: Use s_movk_i32
llvm-svn: 221922
2014-11-13 20:44:23 +00:00
Matt Arsenault 1a179e8219 R600/SI: Fix definition for s_cselect_b32
These were directly using the old base instruction
class, and specifying the wrong register classes
for operands. The operands can be the other special
inputs besides SGPRs. The op name was also being
directly used for the asm string, so this was printed
without any operands.

llvm-svn: 221921
2014-11-13 20:23:36 +00:00
Matt Arsenault 6ef66144f3 R600: Fix assert on empty function
If a function is just an unreachable, this would hit a
"this is not a MachO target" assertion because of setting
HasSubsectionViaSymbols.

llvm-svn: 221920
2014-11-13 20:07:40 +00:00
Rui Ueyama 5dcf11d177 Un-break the big-endian buildbots
llvm-svn: 221919
2014-11-13 20:07:06 +00:00
Matt Arsenault cc8d3b8774 R600: Error on initializer for LDS.
Also give a proper error for other address spaces.

llvm-svn: 221917
2014-11-13 19:56:13 +00:00
Matt Arsenault 1cffa4c191 R600/SI: Get rid of FCLAMP_SI pseudo
It's not necessary. Also use complex patterns to allow
src modifier usage.

llvm-svn: 221916
2014-11-13 19:49:04 +00:00
David Majnemer 73cc6ff54d Object, Mach-O: Refactor and clean code up
Don't assert if we can return an error code, reuse existing
functionality like is64Bit().

llvm-svn: 221915
2014-11-13 19:48:56 +00:00
Matt Arsenault 581a7a6933 R600/SI: Allow commuting with src2_modifiers
llvm-svn: 221911
2014-11-13 19:26:50 +00:00
Matt Arsenault 95e48668b6 R600/SI: Allow commuting some 3 op instructions
e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c

This simplifies matching v_madmk_f32.

This looks somewhat surprising, but it appears to be
OK to do this. We can commute src0 and src1 in all
of these instructions, and that's all that appears
to matter.

llvm-svn: 221910
2014-11-13 19:26:47 +00:00
Tim Northover 631cc9ce1a ARM: allow constpool entry to be moved to the user's block in all cases.
Normally entries can only move to a lower address, but when that wasn't viable,
the user's block was considered anyway. Unfortunately, it went via
createNewWater which wasn't designed to handle the case where there's already
an island after the block.

Unfortunately, the test we have is slow and fragile, and I couldn't reduce it
to anything sane even with the @llvm.arm.space intrinsic. The test change here
is recreating the previous one after the change.

rdar://problem/18545506

llvm-svn: 221905
2014-11-13 17:58:53 +00:00
Tim Northover ab85dcc7b8 ARM: avoid duplicating branches during constant islands.
We were using a naive heuristic to determine whether a basic block already had
an unconditional branch at the end. This mostly corresponded to reality
(assuming branches got optimised) because there's not much point in a branch to
the next block, but could go wrong.

llvm-svn: 221904
2014-11-13 17:58:51 +00:00
Tim Northover 650b0ee53b ARM: add @llvm.arm.space intrinsic for testing ConstantIslands.
Creating tests for the ConstantIslands pass is very difficult, since it depends
on precise layout details. Having the ability to precisely inject a number of
bytes into the stream helps greatly.

llvm-svn: 221903
2014-11-13 17:58:48 +00:00
Rafael Espindola 6c933979d3 Fix a regression on the disassembling C API.
The fix is easy. Unfortunately, we had 0 tests, so adding one was somewhat
complicated.

Thanks to Kevin Enderby for the report.

llvm-svn: 221899
2014-11-13 16:52:07 +00:00
Colin LeMahieu b6bbeee744 [Hexagon]
NFC Renaming reserved identifier.

llvm-svn: 221898
2014-11-13 16:36:30 +00:00
Chad Rosier 9074b18785 [Reassociate] Update comment. NFC.
llvm-svn: 221894
2014-11-13 15:40:20 +00:00
Aaron Ballman af93cd8de9 Fixing -Wtype-limits warnings with the asserts (the expression would always evaluate to true). Also fixing a -Wcast-qual warning, where the cast expression isn't required.
llvm-svn: 221888
2014-11-13 13:55:13 +00:00
Duncan P. N. Exon Smith f8edc4a4b1 IR: Create the Metadata class
This will become the root of a new class hierarchy separate from
`Value`.  As a first step, stick it between `Value` and `MDNode`.

This is part of PR21532.

llvm-svn: 221886
2014-11-13 13:17:47 +00:00
Elena Demikhovsky d5e95b57e0 AVX-512: SINT_TO_FP cost model and some bugfixes
Checked some corner cases, for example translation
of <8 x i1> to <8 x double>

llvm-svn: 221883
2014-11-13 11:46:16 +00:00
David Majnemer 94751be7fa Object, COFF: Refactor code to get relocation iterators
No functional change intended.

llvm-svn: 221880
2014-11-13 09:50:18 +00:00
Aditya Nandakumar a27193297f This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively
llvm-svn: 221878
2014-11-13 09:26:31 +00:00
Hal Finkel 45ba2c10e4 Revert r219432 - "Revert "[BasicAA] Revert "Revert r218714 - Make better use of zext and sign information."""
Let's try this again...

This reverts r219432, plus a bug fix.

Description of the bug in r219432 (by Nick):

The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).

Nick also adds this comment:

ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the  == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.

And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).

Original commit message:

This reverts r218944, which reverted r218714, plus a bug fix.

Description of the bug in r218714 (by Nick):

The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.

Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.

Original commit message:

Two related things:

 1. Fixes a bug when calculating the offset in GetLinearExpression. The code
    previously used zext to extend the offset, so negative offsets were converted
    to large positive ones.

 2. Enhance aliasGEP to deduce that, if the difference between two GEP
    allocations is positive and all the variables that govern the offset are also
    positive (i.e. the offset is strictly after the higher base pointer), then
    locations that fit in the gap between the two base pointers are NoAlias.

Patch by Nick White!

llvm-svn: 221876
2014-11-13 09:16:54 +00:00
David Majnemer e830c60d89 Object, COFF: Increase code reuse
Split getObject's smarts into checkOffset, use this to replace the
handwritten check in getSectionContents.  Similarly, replace checks in
section_rel_begin/section_rel_end with getNumberOfRelocations.

No functionality change intended.

llvm-svn: 221873
2014-11-13 08:46:37 +00:00
David Majnemer 1f80b0a8e0 Object, COFF: getRelocationSymbol shouldn't assert
lib/Object is supposed to be robust to malformed object files.  Don't
assert if we don't have a symbol table.  I'll try to come up with a test
case later.

llvm-svn: 221870
2014-11-13 07:42:11 +00:00
David Majnemer 2314b3defa Object, COFF: Cleanup some code in getSectionName
Use StringRef::startswith to tidy up some code, no functionality change
intended.

llvm-svn: 221869
2014-11-13 07:42:09 +00:00
David Majnemer 58323a9763 Object, COFF: Fix some theoretical bugs
getObject didn't consider the case where a pointer came before the start
of the object file.  No test is included, trying to come up with
something reasonable.

llvm-svn: 221868
2014-11-13 07:42:07 +00:00
Rafael Espindola c11bd4229c Read 64 bits at a time in the bitcode reader.
The reading of 64 bit values could still be optimized, but at least this cuts
down on the number of virtual calls to fetch more data.

llvm-svn: 221865
2014-11-13 07:23:22 +00:00
Chandler Carruth fee91883f4 [x86] Teach the vector shuffle lowering to make a more nuanced decision
between splitting a vector into 128-bit lanes and recombining them vs.
decomposing things into single-input shuffles and a final blend.

This handles a large number of cases in AVX1 where the cross-lane
shuffles would be much more expensive to represent even though we end up
with a fast blend at the root. Instead, we can do a better job of
shuffling in a single lane and then inserting it into the other lanes.

This fixes the remaining bits of Halide's regression captured in PR21281
for AVX1. However, the bug persists in AVX2 because I've made this
change reasonably conservative. The cases where it makes sense in AVX2
to split into 128-bit lanes are much more rare because we can often do
full permutations across all elements of the 256-bit vector. However,
the particular test case in PR21281 is an example of one of the rare
cases where it is *always* better to work in a single 128-bit lane. I'm
going to try to teach the logic to detect and form the good code even in
AVX2 next, but it will need to use a separate heuristic.

Finally, there is one pesky regression here where we previously would
craftily use vpermilps in AVX1 to shuffle both high and low halves at
the same time. We no longer pull that off, and not for any really good
reason. Ultimately, I think this is just another missing nuance to the
selection heuristic that I'll try to add in afterward, but this change
already seems strictly worth doing considering the magnitude of the
improvements in common matrix math shuffle patterns.

As always, please let me know if this causes a surprising regression for
you.

llvm-svn: 221861
2014-11-13 04:06:10 +00:00
Rui Ueyama ffa4cebe91 llvm-readobj: Print out address table when dumping COFF delay-import table
llvm-svn: 221855
2014-11-13 03:22:54 +00:00
Frederic Riss 0f7abef2cf Add an assert and a test that verify r221709's fix.
llvm-svn: 221854
2014-11-13 03:20:23 +00:00
Chandler Carruth 253dd39a9a [x86] Don't form overly fragmented blends when splitting and
re-combining shuffles because nothing was available in the wider vector
type.

The key observation (which I've put in the comments for future
maintainers) is that at this point, no further combining is really
possible. And so even though these shuffles trivially could be combined,
we need to actually do that as we produce them when producing them this
late in the lowering.

This fixes another (huge) part of the Halide vector shuffle regressions.
As it happens, this was already well covered by the tests, but I hadn't
noticed how bad some of these got. The specific patterns that turn
directly into unpckl/h patterns were occurring *many* times in common
vector processing code.

There are still more problems here sadly, but trying to incrementally
tease them apart and it looks like this is the core of the problem in
the splitting logic.

There is some chance of regression here, you can see it in the test
changes. Specifically, where we stop forming pshufb in some cases, it is
possible that pshufb was in fact faster. Intel "says" that pshufb is
slower than the instruction sequences replacing it.

llvm-svn: 221852
2014-11-13 02:42:08 +00:00
Quentin Colombet f5485bb008 [CodeGenPrepare] Handle zero extensions in the TypePromotionHelper.
Prior to this patch the TypePromotionHelper was promoting only sign extensions.
Supporting zero extensions changes:
- How constants are extended.
- How sign extensions, zero extensions, and truncate are composed together.
- How the type of the extended operation is recorded. Now we need to know the
  kind of the extension as well as its type.

Each change is fairly small, unlike the diff.
Most of the diff are comments/variable renaming to say "extension" instead of
"sign extension".

The performance improvements on the test suite are within the noise.

Related to <rdar://problem/18310086>.

llvm-svn: 221851
2014-11-13 01:44:51 +00:00
Juergen Ributzka 957a1454cc [FastISel][AArch64] Optimize select when one of the operands is a 'true' or 'false' value.
Optimize selects of i1 in the presence of 'true' and 'false' operands to simple
logic operations.

This fixes rdar://problem/18960150.

llvm-svn: 221848
2014-11-13 00:36:46 +00:00
Juergen Ributzka 424c5fd12f [FastISel][AArch64] Fold the cmp into the select when possible.
This folds the compare emission into the select emission when possible, so we
can directly use the flags and don't have to emit a separate compare.

Related to rdar://problem/18960150.

llvm-svn: 221847
2014-11-13 00:36:43 +00:00
Juergen Ributzka d1a042abd0 [FastISel][AArch64] Extend 'select' lowering to support also i1 to i16.
Related to rdar://problem/18960150.

llvm-svn: 221846
2014-11-13 00:36:38 +00:00
Frederic Riss e1f4958122 Revert "[dwarfdump] Add support for dumping accelerator tables."
This reverts commit r221836.

The tests are asserting on some buildbots. This also reverts the
test part of r221837 as it relies on dwarfdump dumping the
accelerator tables.

llvm-svn: 221842
2014-11-13 00:15:15 +00:00
Paul Robinson d9c4a9af7c Improve long path name support on Windows.
Windows normally limits the length of an absolute path name to 260
characters; directories can have lower limits.  These limits increase
to about 32K if you use absolute paths with the special '\\?\'
prefix. Teach Support\Windows\Path.inc to use that prefix as needed.

TODO: Other parts of Support could also learn to use this prefix.
llvm-svn: 221841
2014-11-13 00:12:14 +00:00
Sanjoy Das c5676df3ec Teach ScalarEvolution to sharpen range information.
If x is known to have the range [a, b), in a loop predicated by (icmp
ne x, a) its range can be sharpened to [a + 1, b).  Get
ScalarEvolution and hence IndVars to exploit this fact.

This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.

This change was originally landed in r219834 but had a bug and broke
ASan. It was reverted in r219878, and is now being re-landed after
fixing the original bug.

phabricator: http://reviews.llvm.org/D5639
reviewed by: atrick

llvm-svn: 221839
2014-11-13 00:00:58 +00:00
Frederic Riss 3a6b354b3e Fix emission of Dwarf accelerator table when there are multiple CUs.
The DIE offset in the accel tables is an offset relative to the start
of the debug_info section, but we were encoding the offset to the
start of the containing CU.

llvm-svn: 221837
2014-11-12 23:48:14 +00:00
Frederic Riss 39467276d0 [dwarfdump] Add support for dumping accelerator tables.
The class used for the dump only allows to dump for the moment, but
it can (and will) be easily extended to support search also.

llvm-svn: 221836
2014-11-12 23:48:10 +00:00
Frederic Riss e4576d2c46 Allow DWARFFormValue::extractValue to be called with a null CU.
Currently FormValues are only used for attributes of DIEs and thus
uers always have a CU lying around when calling into the FormValue
API.
Accelerator tables encode their information using the same Forms
as the attributes, thus it is natural to use DWARFFormValue to
extract/dump them. There is no CU in that case though. Allow the
API to be called with a null CU arguemnt by making the RelocMap
lookup conditional on the CU pointer validity. And document this
new behvior in the header. (Test coverage for this use of the API
comes in the DwarfAccelTable support patch)

llvm-svn: 221835
2014-11-12 23:48:04 +00:00
Frederic Riss 6cbfa91e91 Remove unsused variables.
llvm-svn: 221834
2014-11-12 23:48:01 +00:00
Ahmed Bougacha 026600d967 [CodeGenPrepare] Replace other uses of EVT::getEVT with TL::getValueType.
r221820 fixed a problem (PR21548) where an iPTR was used in TLI legality checks,
which isn't valid and resulted in a failed assertion.
The solution was to lower pointer types into the correct target's VT, by
using TL::getValueType instead of EVT::getEVT.

This commit changes 3 other uses of EVT::getEVT, but without any tests:
- One of these non-lowered EVTs is passed to allowsMisalignedMemoryAccesses,
which goes into target's TL implementation and doesn't cause any problem (yet.)
- Two others are passed to TLI.isOperationLegalOrCustom:
  - one only looks at extensions, so doesn't concern pointers.
  - one only looks at binary operators, so also isn't a problem.

The latter might some day be exposed to pointers and cause the same assert as
the original PR, because there's a comment hinting at also supporting cast ops.

For consistency, update all of them and be done with it.

llvm-svn: 221827
2014-11-12 23:05:03 +00:00
Ahmed Bougacha 0788d49a40 [CodeGenPrepare][AArch64] Fix a TLI legality check on iPTR to use a lowered instead.
Fixes PR21548.  Related to PR20474.

llvm-svn: 221820
2014-11-12 22:16:55 +00:00
Sanjay Patel f6f7d5d1dd Expose the number of Newton-Raphson iterations applied to the hardware's reciprocal estimate as a parameter (x86).
This is a follow-on to r221706 and r221731 and discussed in more detail in PR21385.

This patch also loosens the testcase checking for btver2. We know that the "1.0" will be loaded, but
we can't tell exactly when, so replace the CHECK-NEXT specifiers with plain CHECKs. The CHECK-NEXT
sequence relied on a quirk of post-RA-scheduling that may change independently of anything in these tests.

llvm-svn: 221819
2014-11-12 21:39:01 +00:00
Ahmed Bougacha 55a333d89b Add fortified (__*_chk) library functions to TLI (NFC)
One of them (__memcpy_chk) was already there, the others were checked
by comparing function names.
Note that the fortified libfuncs are now part of TLI, but are always
available, because they aren't generated, only optimized into the
non-checking versions.

Differential Revision: http://reviews.llvm.org/D6179

llvm-svn: 221817
2014-11-12 21:23:34 +00:00
Timur Iskhodzhanov 0e76a16200 Temporary fix for PR21528 - use mangled C++ function names in COFF debug info to un-break ASan on Windows
llvm-svn: 221813
2014-11-12 20:21:20 +00:00
Timur Iskhodzhanov a11b32b7e5 [COFF] Make it clearer that the symbols subsection holds function display name rather than just name
llvm-svn: 221812
2014-11-12 20:10:09 +00:00
Cameron McInally 73a6bca32b [AVX512] Add integer shift by immediate intrinsics.
llvm-svn: 221811
2014-11-12 19:58:54 +00:00
Aaron Ballman 9f8d2b0995 Changing a StringRef::begin() call into StringRef::data(); NFC.
llvm-svn: 221808
2014-11-12 19:43:13 +00:00
Rafael Espindola 0c9aa57a07 Use the return of readBytes to find out if we are at the end of the stream.
This allows the removal of isObjectEnd and opens the way for reading 64 bits
at a time.

llvm-svn: 221804
2014-11-12 18:37:00 +00:00
Sanjay Patel 4c219fd248 CGSCC should not treat intrinsic calls like function calls (PR21403)
Make the handling of calls to intrinsics in CGSCC consistent: 
they are not treated like regular function calls because they
are never lowered to function calls.

Without this patch, we can get dangling pointer asserts from
the subsequent loop that processes callsites because it already
ignores intrinsics.

See http://llvm.org/bugs/show_bug.cgi?id=21403 for more details / discussion.

Differential Revision: http://reviews.llvm.org/D6124

llvm-svn: 221802
2014-11-12 18:25:47 +00:00
Jingyue Wu a41cf018b8 Fix broken doxygen annotations, NFC
llvm-svn: 221801
2014-11-12 18:25:06 +00:00
Jingyue Wu 8a12cea5f1 Disable indvar widening if arithmetics on the wider type are more expensive
Summary:
Reapply r221772. The old patch breaks the bot because the @indvar_32_bit test
was run whether NVPTX was enabled or not.

IndVarSimplify should not widen an indvar if arithmetics on the wider
indvar are more expensive than those on the narrower indvar. For
instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is
twice as expensive as that on i32, because the hardware needs to
simulate a 64-bit integer using two 32-bit integers.

Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo.

Fixes PR21148.

Test Plan:
Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics
on the wider type are more expensive. This test is run only when NVPTX is
enabled.

Reviewers: jholewinski, eliben, meheff, atrick

Reviewed By: atrick

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D6196

llvm-svn: 221799
2014-11-12 18:09:15 +00:00
Sanjay Patel 7777b50eaf remove function names from comments; NFC
llvm-svn: 221798
2014-11-12 18:07:42 +00:00
Rafael Espindola 2d05db49bb Return the number of read bytes in MemoryObject::readBytes.
Returning more information will allow BitstreamReader to be simplified a bit
and changed to read 64 bits at a time.

llvm-svn: 221794
2014-11-12 17:11:16 +00:00
Justin Hibbits a88b605721 Add support for small-model PIC for PowerPC.
Summary:
Large-model was added first.  With the addition of support for multiple PIC
models in LLVM, now add small-model PIC for 32-bit PowerPC, SysV4 ABI.  This
generates more optimal code, for shared libraries with less than about 16380
data objects.

Test Plan: Test cases added or updated

Reviewers: joerg, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, mcrosier, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D5399

llvm-svn: 221791
2014-11-12 15:16:30 +00:00
Rafael Espindola de1e5b8dfd Reduce code duplication a bit. NFC.
llvm-svn: 221785
2014-11-12 14:48:38 +00:00
Aaron Ballman 7a7b144117 Fixing a -Wcast-qual warning; NFC.
llvm-svn: 221781
2014-11-12 13:55:27 +00:00
Zoran Jovanovic fd888630b5 [mips][micromips] Add predicate 'InMicroMips' at CodeGen patterns for microMIPS instructions
Differential Revision: http://reviews.llvm.org/D6198

llvm-svn: 221780
2014-11-12 13:30:10 +00:00
Chandler Carruth 0c922fcec5 [x86] Start improving the matching of unpck instructions based on test
cases from Halide folks. This initial step was extracted from
a prototype change by Clay Wood to try and address regressions found
with Halide and the new vector shuffle lowering.

llvm-svn: 221779
2014-11-12 10:05:18 +00:00
Elena Demikhovsky be8808dc3f AVX-512: Intrinsics for ERI
3 instructions: vrcp28, vrsqrt28, vexp2, only vector forms.
Intrinsics include SAE (Suppres All Exceptions) parameter.

http://reviews.llvm.org/D6214

llvm-svn: 221774
2014-11-12 07:31:03 +00:00
Jingyue Wu a48273390c Reverts r221772 which fails tests
llvm-svn: 221773
2014-11-12 07:19:25 +00:00
Jingyue Wu 635a9b14fa Disable indvar widening if arithmetics on the wider type are more expensive
Summary:
IndVarSimplify should not widen an indvar if arithmetics on the wider
indvar are more expensive than those on the narrower indvar. For
instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is
twice as expensive as that on i32, because the hardware needs to
simulate a 64-bit integer using two 32-bit integers.

Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo.

Fixes PR21148.

Test Plan:
Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics
on the wider type are more expensive.

Reviewers: jholewinski, eliben, meheff, atrick

Reviewed By: atrick

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D6196

llvm-svn: 221772
2014-11-12 06:58:45 +00:00
Bill Schmidt 729547847f [PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics
This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for
PowerPC, which provide programmer access to the lxvd2x, lxvw4x,
stxvd2x, and stxvw4x instructions.

New LLVM intrinsics are provided to represent these four instructions
in IntrinsicsPowerPC.td.  These are patterned after the similar
intrinsics for lvx and stvx (Altivec).  In PPCInstrVSX.td, these
intrinsics are tied to the code gen patterns, with additional patterns
to allow plain vanilla loads and stores to still generate these
instructions.

At -O1 and higher the intrinsics are immediately converted to loads
and stores in InstCombineCalls.cpp.  This will open up more
optimization opportunities while still allowing the correct
instructions to be generated.  (Similar code exists for aligned
Altivec loads and stores.)

The new intrinsics are added to the code that checks for consecutive
loads and stores in PPCISelLowering.cpp, as well as to
PPCTargetLowering::getTgtMemIntrinsic().

There's a new test to verify the correct instructions are generated.
The loads and stores tend to be reordered, so the test just counts
their number.  It runs at -O2, as it's not very effective to test this
at -O0, when many unnecessary loads and stores are generated.

I ended up having to modify vsx-fma-m.ll.  It turns out this test case
is slightly unreliable, but I don't know a good way to prevent
problems with it.  The xvmaddmdp instructions read and write the same
register, which is one of the multiplicands.  Commutativity allows
either to be chosen.  If the FMAs are reordered differently than
expected by the test, the register assignment can be different as a
result.  Hopefully this doesn't change often.

There is a companion patch for Clang.

llvm-svn: 221767
2014-11-12 04:19:40 +00:00
Rafael Espindola 79e1f9ff99 Merge StreamableMemoryObject into MemoryObject.
Every MemoryObject is a StreamableMemoryObject since the removal of
StringRefMemoryObject, so just merge the two.

I will clean up the MemoryObject interface in the upcoming commits.

llvm-svn: 221766
2014-11-12 03:55:46 +00:00
Rafael Espindola 5b7bb884e6 Remove unused method. NFC.
llvm-svn: 221759
2014-11-12 02:35:31 +00:00
Rafael Espindola 5468ded718 Make readBytes pure virtual. Every real implementation has it.
llvm-svn: 221758
2014-11-12 02:30:38 +00:00
Rafael Espindola ef0482f50a Remove unused method. NFC.
llvm-svn: 221757
2014-11-12 02:27:40 +00:00
Rafael Espindola cac0088e91 Remove the now unused StringRefMemoryObject.h.
llvm-svn: 221755
2014-11-12 02:13:27 +00:00
Rafael Espindola 7fc5b87480 Pass an ArrayRef to MCDisassembler::getInstruction.
With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t>
instead of a MemoryObject.

Even on X86 there is a maximum size an instruction can have. Given
that, it seems way simpler and more efficient to just pass an ArrayRef
to the disassembler instead of a MemoryObject and have it do a virtual
call every time it wants some extra bytes.

llvm-svn: 221751
2014-11-12 02:04:27 +00:00
Nick Kledzik f44dbda542 Object, support both mach-o archive t.o.c file names
For historical reasons archives on mach-o have two possible names for the 
file containing the table of contents for the archive: "__.SYMDEF SORTED" 
and "__.SYMDEF".  But the libObject archive reader only supported the former.

This patch fixes llvm::object::Archive to support both names.

llvm-svn: 221747
2014-11-12 01:37:45 +00:00
Rafael Espindola 35a12a85a1 Remove a bit of dead code.
Every "real" object file implements this an ptx doesn't use it.

llvm-svn: 221746
2014-11-12 01:27:22 +00:00
Philip Reames 319c48eb2d Extend intrinsic name mangling to support arrays, named structs, and function types.
Currently, we have a type parameter mechanism for intrinsics. Rather than having to specify a separate intrinsic for each combination of argument and return types, we can specify a single intrinsic with one or more type parameters. These type parameters are passed explicitly to Intrinsic::getDeclaration or can be specified implicitly in the naming of the intrinsic function in an LL file.

Today, the types are limited to integer, floating point, and pointer types. With a goal of supporting symbolic targets for patchpoints and statepoints, this change adds support for function types.  The change also includes support for first class aggregate types (named structures and arrays) since these appear in function types we've encountered.  

Reviewed by: atrick, ributzka
Differential Revision: http://reviews.llvm.org/D4608

llvm-svn: 221742
2014-11-12 00:21:51 +00:00
Chad Rosier f53f07046b [Reassociate] Canonicalize negative constants out of expressions.
Add support for FDiv, which was regressed by the previous commit.

llvm-svn: 221738
2014-11-11 23:36:42 +00:00
Philip Reames 66c6de61ee Canonicalize an assume(load != null) into !nonnull metadata
We currently have two ways of informing the optimizer that the result of a load is never null: metadata and assume. This change converts the second in to the former. This avoids a need to implement optimizations using both forms.

We should probably extend this basic idea to metadata of other forms; in particular, range metadata. We view is that assumes should be considered a "last resort" for when there isn't a more canonical way to represent something.

Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D5951

llvm-svn: 221737
2014-11-11 23:33:19 +00:00
Sanjay Patel 50fc6ff5e3 Initialize new subtarget feature variable for generating reciprocal estimate instructions.
This was missed in r221706.

llvm-svn: 221731
2014-11-11 23:13:15 +00:00
Duncan P. N. Exon Smith 9419863909 libLTO: Assert if LTOCodeGenerator and LTOModule are from different contexts
llvm-svn: 221730
2014-11-11 23:13:10 +00:00
Juergen Ributzka 89441b0dd8 [FastISel][AArch64] Add support for fabs intrinsic.
Lower the llvm.fabs intrinsic to the 'fabs' MI instruction.

This fixes rdar://problem/18946552.

llvm-svn: 221729
2014-11-11 23:10:44 +00:00
Duncan P. N. Exon Smith 97b45874bf libLTO: Allow LTOModule to own a context
llvm-svn: 221728
2014-11-11 23:08:05 +00:00
Duncan P. N. Exon Smith de5e32b5b4 libLTO: Allow LTOCodeGenerator to own a context
llvm-svn: 221726
2014-11-11 23:03:29 +00:00
Kostya Serebryany 231bd088d8 [asan] adding ShadowOffset64 for mips64, patch by Kumar Sukhani
llvm-svn: 221725
2014-11-11 23:02:57 +00:00
Chad Rosier 094ac7735b [Reassociate] Canonicalize negative constants out of expressions.
This is a reapplication of r221171, but we only perform the transformation
on expressions which include a multiplication.  We do not transform rem/div
operations as this doesn't appear to be safe in all cases.

llvm-svn: 221721
2014-11-11 22:58:35 +00:00
Kostya Serebryany 29a18dcbc5 Move asan-coverage into a separate phase.
Summary:
This change moves asan-coverage instrumentation
into a separate Module pass.
The other part of the change in clang introduces a new flag
-fsanitize-coverage=N.
Another small patch will update tests in compiler-rt.

With this patch no functionality change is expected except for the flag name.
The following changes will make the coverage instrumentation work with tsan/msan

Test Plan: Run regression tests, chromium.

Reviewers: nlewycky, samsonov

Reviewed By: nlewycky, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6152

llvm-svn: 221718
2014-11-11 22:14:37 +00:00
Duncan P. N. Exon Smith de36e8040f Revert "IR: MDNode => Value"
Instead, we're going to separate metadata from the Value hierarchy.  See
PR21532.

This reverts commit r221375.
This reverts commit r221373.
This reverts commit r221359.
This reverts commit r221167.
This reverts commit r221027.
This reverts commit r221024.
This reverts commit r221023.
This reverts commit r220995.
This reverts commit r220994.

llvm-svn: 221711
2014-11-11 21:30:22 +00:00
Tom Roeder 6312f4a422 Fix build break: remove unused variable in FCFI.
llvm-svn: 221710
2014-11-11 21:26:33 +00:00
Frederic Riss 8ad4f498fb Totally forget deallocated SDNodes in SDDbgInfo.
What would happen before that commit is that the SDDbgValues associated with
a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep
a map entry keyed by the SDNode pointer pointing to this list of invalidated
SDDbgNodes. As the memory gets reused, the list might get wrongly associated
with another new SDNode. As the SDDbgValues are cloned when they are transfered,
this can lead to an exponential number of SDDbgValues being produced during
DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893

Note that the previous behavior wasn't really buggy as the invalidation made
sure that the SDDbgValues won't be used. This commit can be considered a
memory optimization and as such is really hard to validate in a unit-test.

llvm-svn: 221709
2014-11-11 21:21:08 +00:00
Tom Roeder eb7a303d1b Add Forward Control-Flow Integrity.
This commit adds a new pass that can inject checks before indirect calls to
make sure that these calls target known locations. It supports three types of
checks and, at compile time, it can take the name of a custom function to call
when an indirect call check fails. The default failure function ignores the
error and continues.

This pass incidentally moves the function JumpInstrTables::transformType from
private to public and makes it static (with a new argument that specifies the
table type to use); this is so that the CFI code can transform function types
at call sites to determine which jump-instruction table to use for the check at
that site.

Also, this removes support for jumptables in ARM, pending further performance
analysis and discussion.

Review: http://reviews.llvm.org/D4167
llvm-svn: 221708
2014-11-11 21:08:02 +00:00
Sanjay Patel e2e589288f Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385).
This is a first step for generating SSE rcp instructions for reciprocal
calcs when fast-math allows it. This is very similar to the rsqrt optimization
enabled in D5658 ( http://reviews.llvm.org/rL220570 ).

For now, be conservative and only enable this for AMD btver2 where performance
improves significantly both in terms of latency and throughput.

We may never enable this codegen for Intel Core* chips because the divider circuits
are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21
cycle critical path for the rcp + mul + sub + mul + add estimate.

Follow-on patches may allow configuration of the number of Newton-Raphson refinement
steps, add AVX512 support, and enable the optimization for more chips.

More background here: http://llvm.org/bugs/show_bug.cgi?id=21385

Differential Revision: http://reviews.llvm.org/D6175

llvm-svn: 221706
2014-11-11 20:51:00 +00:00
Bill Schmidt 3d9674cfb1 [PowerPC] Replace foul hackery with real calls to __tls_get_addr
My original support for the general dynamic and local dynamic TLS
models contained some fairly obtuse hacks to generate calls to
__tls_get_addr when lowering a TargetGlobalAddress.  Rather than
generating real calls, special GET_TLS_ADDR nodes were used to wrap
the calls and only reveal them at assembly time.  I attempted to
provide correct parameter and return values by chaining CopyToReg and
CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not
fully correct.  Problems were seen with two back-to-back stores to TLS
variables, where the call sequences ended up overlapping with unhappy
results.  Additionally, since these weren't real calls, the proper
register side effects of a call were not recorded, so clobbered values
were kept live across the calls.

The proper thing to do is to lower these into calls in the first
place.  This is relatively straightforward; see the changes to
PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp.
The changes here are standard call lowering, except that we need to
track the fact that these calls will require a relocation.  This is
done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the
TargetGlobalAddress operand that appears earlier in the sequence.

The calls to LowerCallTo() eventually find their way to
LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(),
which calls PrepareCall().  In PrepareCall(), we detect the calls to
__tls_get_addr and immediately snag the TargetGlobalTLSAddress with
the annotated relocation information.  This becomes an extra operand
on the call following the callee, which is expected for nodes of type
tlscall.  We change the call opcode to CALL_TLS for this case.  Back
in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only,
since we require a TOC-restore nop following the call for the 64-bit
ABIs.

During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td
convert the CALL_TLS nodes into BL_TLS nodes, and convert the
CALL_NOP_TLS nodes into BL8_NOP_TLS nodes.  This replaces the code
removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS
nodes can now be emitted normally using their patterns and the
associated printTLSCall print method.

Finally, as a result of these changes, all references to get-tls-addr
in its various guises are no longer used, so they have been removed.

There are existing TLS tests to verify the changes haven't messed
anything up).  I've added one new test that verifies that the problem
with the original code has been fixed.

llvm-svn: 221703
2014-11-11 20:44:09 +00:00
Rafael Espindola a9c28b68cd Use a 8 bit immediate when possible.
This fixes pr21529.

llvm-svn: 221700
2014-11-11 19:46:36 +00:00
Dario Domizioli e904e85faf [X86][ELF] Fix PR20243 - leaf frame pointer bug with TLS access
The ISel lowering for global TLS access in PIC mode was creating a pseudo 
instruction that is later expanded to a call, but the code was not 
setting the hasCalls flag in the MachineFrameInfo alongside the adjustsStack 
flag. This caused some functions to be mistakenly recognized as leaf functions,
and this in turn affected the decision to eliminate the frame pointer.

With the fix, hasCalls is properly set and the leaf frame pointer is correctly
preserved.

llvm-svn: 221695
2014-11-11 18:44:49 +00:00
Oliver Stannard 8c2c67e63c LLVM incorrectly folds xor into select
LLVM replaces the SelectionDAG pattern (xor (set_cc cc x y) 1) with
(set_cc !cc x y), which is only correct when the xor has type i1.
Instead, we should check that the constant operand to the xor is all
ones.

llvm-svn: 221693
2014-11-11 17:36:01 +00:00
Vasileios Kalintiris b2dd15f8c7 [mips] Add preliminary support for the MIPS II target.
Summary:
This patch enables code generation for the MIPS II target. Pre-Mips32
targets don't have the MUL instruction, so we add the correspondent
pattern that uses the MULT/MFLO combination in order to retrieve the
product.

This is WIP as we don't support code generation for select nodes due to
the lack of conditional-move instructions.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6150

llvm-svn: 221686
2014-11-11 11:43:55 +00:00
Vasileios Kalintiris 8c1c95e95c [mips] Add hardware register name "hwr_ulr" ($29)
The canonical name when printing assembly is still $29. The reason is that
GAS does not accept "$hwr_ulr" at the moment.

This addresses the comments from r221307, which reverted the original
commit r221299.

llvm-svn: 221685
2014-11-11 11:22:39 +00:00
Andrea Di Biagio 5fa2e15453 [X86] Add missing check for 'isINSERTPSMask' in method 'isShuffleMaskLegal'.
This helps the DAGCombiner to identify more opportunities to fold shuffles.

llvm-svn: 221684
2014-11-11 11:20:31 +00:00
Vasileios Kalintiris 10b5ba3f6e Recommit "[mips] Add names and tests for the hardware registers"
The original commit r221299 was reverted in r221307.  I removed the name
"hrw_ulr" ($29) from the original commit because two tests were failing.

llvm-svn: 221681
2014-11-11 10:31:31 +00:00
David Majnemer 2cc4bc77bf MC, COFF: Use relocations for function references inside the section
Referencing one symbol from another in the same section does not
generally require a relocation.  However, the MS linker has a feature
called /INCREMENTAL which enables incremental links.  It achieves this
by creating thunks to the actual function and redirecting all
relocations to point to the thunk.

This breaks down with the old scheme if you have a function which
references, say, itself.  On x86_64, we would use %rip relative
addressing to reference the start of the function from out current
position.  This would lead to miscompiles because other references might
reference the thunk instead, breaking function pointer equality.

This fixes PR21520.

llvm-svn: 221678
2014-11-11 08:43:57 +00:00
Craig Topper f655cddb13 Use uint64_t as the type for the X86 TSFlag format enum. Allows removal of the VEXShift hack that was used to access the higher bits of TSFlags.
llvm-svn: 221673
2014-11-11 07:32:32 +00:00
Michael Kuperstein 3fe15e498f [X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext
This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details.
Recommitting - This time, with a hopefully working test.

Differential Revision: http://reviews.llvm.org/D6128

llvm-svn: 221672
2014-11-11 07:07:40 +00:00
Jingyue Wu dfd4eb9285 [NVPTX] Remove dead code in NVPTXTargetTransformInfo (NFC)
llvm-svn: 221668
2014-11-11 05:24:04 +00:00
Rafael Espindola 961d469445 MCAsmParserExtension has a copy of the MCAsmParser. Use it.
Base classes were storing a second copy.

llvm-svn: 221667
2014-11-11 05:18:41 +00:00
Rafael Espindola 804f43c655 Add const. NFC.
This adds const to a few methods that already return const references or
creates a const version when they reterun non-const references.

llvm-svn: 221666
2014-11-11 05:11:47 +00:00
Quentin Colombet 360460ba64 [X86] Custom lower UINT_TO_FP from v4f32 to v4i32, and for v8f32 to v8i32 if
AVX2 is available.
According to IACA, the new lowering has a throughput of 8 cycles instead of 13
with the previous one.

Althought this lowering kicks in some SPECs benchmarks, the performance
improvement was within the noise.

Correctness testing has been done for the whole range of uint32_t with the
following program:
    uint4 v = (uint4) {0,1,2,3};
    uint32_t i;
    
    //Check correctness over entire range for uint4 -> float4 conversion
    for( i = 0; i < 1U << (32-2); i++ )
    {
        float4 t = test(v);
        float4 c = correct(v);
        
        if( 0xf != _mm_movemask_ps( t == c ))
        {
            printf( "Error @ %vx: %vf vs. %vf\n", v, c, t);
            return -1;
        }
        
        v += 4;
    }
Where "correct" is the old lowering and "test" the new one.

The patch adds a test case for the two custom lowering instruction.
It also modifies the vector cost model, which is why cast.ll and uitofp.ll are
modified.
2009-02-26-MachineLICMBug.ll is also modified because we now hoist 7
instructions instead of 4 (3 more constant loads).

rdar://problem/18153096>

llvm-svn: 221657
2014-11-11 02:23:47 +00:00
Nico Weber 7f654a8e8f speling.
llvm-svn: 221652
2014-11-11 01:13:42 +00:00
Chad Rosier a9ae3e311c [yaml2obj] Support AArch64 relocations.
Patch by Daniel Stewart <stewartd@codeaurora.org>!
Phabricator Revision: http://reviews.llvm.org/D6192

llvm-svn: 221639
2014-11-10 23:02:03 +00:00
Michael Kuperstein 217e1eec0d Reverting r221626 due to a too-strict test.
llvm-svn: 221629
2014-11-10 21:07:41 +00:00
Juergen Ributzka ea5870a530 [AArch64][FastISel] Fix kill flags for integer extends.
In the case we optimize an integer extend away and replace it directly with the
source register, we also have to clear all kill flags at all its uses.
This is necessary, because the orignal IR instruction might be trivially dead,
but we replaced it with a nop at MI level.

llvm-svn: 221628
2014-11-10 21:05:31 +00:00
Juergen Ributzka d441725d3d [SwitchLowering] Fix the "fixPhis" function.
Switch statements may have more than one incoming edge into the same BB if they
all have the same value. When the switch statement is converted these incoming
edges are now coming from multiple BBs. Updating all incoming values to be from
a single BB is incorrect and would generate invalid LLVM IR.

The fix is to only update the first occurrence of an incoming value. Switch
lowering will perform subsequent calls to this helper function for each incoming
edge with a new basic block - updating all edges in the process.

This fixes rdar://problem/18916275.

llvm-svn: 221627
2014-11-10 21:05:27 +00:00
Michael Kuperstein 3218b942f4 [X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext
This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits.
See PR20494 for details.

Differential Revision: http://reviews.llvm.org/D6128

llvm-svn: 221626
2014-11-10 20:40:21 +00:00
Rafael Espindola 75b809c9b6 Copy externally_initialized in GlobalVariable::copyAttributesFrom.
Patch by Kevin Frei!

llvm-svn: 221620
2014-11-10 18:41:59 +00:00
Jingyue Wu 0c981bd7df [NVPTX] Add an NVPTX-specific TargetTransformInfo
Summary:
It currently only implements hasBranchDivergence, and will be extended
in later diffs.

Split from D6188.

Test Plan: make check-all

Reviewers: jholewinski

Reviewed By: jholewinski

Subscribers: llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D6195

llvm-svn: 221619
2014-11-10 18:38:25 +00:00
Rafael Espindola 4aa6bea7a2 Misc style fixes. NFC.
This fixes a few cases of:

* Wrong variable name style.
* Lines longer than 80 columns.
* Repeated names in comments.
* clang-format of the above.

This make the next patch a lot easier to read.

llvm-svn: 221615
2014-11-10 18:11:10 +00:00
Vasileios Kalintiris ccde2a9a1e Fix extra semicolon warning. NFC.
llvm-svn: 221613
2014-11-10 17:37:53 +00:00
Zoran Jovanovic 37bca10148 [mips][microMIPS] Fix issue with delay slot filler and microMIPS
Differential Revision: http://reviews.llvm.org/D6193

llvm-svn: 221612
2014-11-10 17:27:56 +00:00
Daniel Sanders 87f9b88bfb [mips] Fix sret arguments for N32/N64 which were accidentally broken in r221534.
llvm-svn: 221604
2014-11-10 15:57:53 +00:00
Saleem Abdulrasool d2c5d7f6da Transforms: address some late comments
We already use the llvm namespace.  Remove the unnecessary prefix.  Use the
StringRef::equals method to compare with C strings rather than instantiating
std::strings.

Addresses late review comments from David Majnemer.

llvm-svn: 221564
2014-11-08 00:00:50 +00:00
Saleem Abdulrasool 92b13aac04 Transforms: sort source files in build
Sort target sources.  NFC.

llvm-svn: 221563
2014-11-08 00:00:47 +00:00
Chad Rosier b3eb452e83 [Reassociate] Better preserve NSW/NUW flags.
Part of PR12985.

Phabricator Revision: http://reviews.llvm.org/D6172

llvm-svn: 221555
2014-11-07 22:12:57 +00:00
Saleem Abdulrasool 89c5ad4cda Transforms: use typedef rather than using aliases
Visual Studio 2012 apparently does not support using alias declarations.  Use
the more traditional typedef approach.  This should let the Windows buildbots
pass.  NFC.

llvm-svn: 221554
2014-11-07 22:09:52 +00:00
Saleem Abdulrasool 5898e09057 Transform: add SymbolRewriter pass
This introduces the symbol rewriter. This is an IR->IR transformation that is
implemented as a CodeGenPrepare pass. This allows for the transparent
adjustment of the symbols during compilation.

It provides a clean, simple, elegant solution for symbol inter-positioning. This
technique is often used, such as in the various sanitizers and performance
analysis.

The control of this is via a custom YAML syntax map file that indicates source
to destination mapping, so as to avoid having the compiler to know the exact
details of the source to destination transformations.

llvm-svn: 221548
2014-11-07 21:32:08 +00:00
Michael J. Spencer d48829b958 Fix style.
llvm-svn: 221547
2014-11-07 21:30:36 +00:00
Matt Arsenault 23755997e4 R600: Remove unused define
llvm-svn: 221543
2014-11-07 20:45:00 +00:00
Daniel Sanders c43cda84ff [mips] Promote i32 arguments to i64 for the N32/N64 ABI and fix <64-bit structs...
Summary:
... and after all that refactoring, it's possible to distinguish softfloat
floating point values from integers so this patch no longer breaks softfloat to
do it.

Remove direct handling of i32's in the N32/N64 ABI by promoting them to
i64. This more closely reflects the ABI documentation and also fixes
problems with stack arguments on big-endian targets.

We now rely on signext/zeroext annotations (already generated by clang) and
the Assert[SZ]ext nodes to avoid the introduction of unnecessary sign/zero
extends.

It was not possible to convert three tests to use signext/zeroext. These tests
are bswap.ll, ctlz-v.ll, ctlz-v.ll. It's not possible to put signext on a
vector type so we just accept the sign extends here for now. These tests don't
pass the vectors the same way clang does (clang puts multiple elements in the
same argument, these map 1 element to 1 argument) so we don't need to worry too
much about it.

With this patch, all known N32/N64 bugs should be fixed and we now pass the
first 10,000 tests generated by ABITest.py.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6117

llvm-svn: 221534
2014-11-07 16:54:21 +00:00
NAKAMURA Takumi 13437e8414 [CMake] LLVMSupport: Give system_libs PRIVATE scope when LLVMSupport is built as SHARED. Users of LLVMSupport won't inherit ${system_libs}.
unittests/SupporTests is another user of libpthreads. Apply LLVM_SYSTEM_LIBS for him explicitly.

llvm-svn: 221531
2014-11-07 16:08:19 +00:00
Daniel Sanders b315c8c762 [mips] Removed the remainder of MipsCC. NFC.
Summary:
One of the calls to AllocateStack (the one in LowerCall) doesn't look like
it should be there but it was there before and removing it breaks the
frame size calculation.

Reviewers: vmedic, theraven

Reviewed By: theraven

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6116

llvm-svn: 221529
2014-11-07 15:33:08 +00:00
Daniel Sanders 2c6f4b430b [mips] Remove MipsCC::reservedArgArea() in favour of MipsABIInfo::GetCalleeAllocdArgSizeInBytes(). NFC.
Summary:

Reviewers: theraven, vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6115

llvm-svn: 221528
2014-11-07 15:03:53 +00:00
NAKAMURA Takumi 0ebd071450 MipsCCState.h: Use LLVM_DELETED_FUNCTION for msc17.
llvm-svn: 221527
2014-11-07 14:56:31 +00:00
Daniel Sanders 0456c15c58 [mips] Move MipsCCState to a separate file and clang-formatted it.
Summary: Depends on D6113

Reviewers: theraven, vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6114

llvm-svn: 221525
2014-11-07 14:24:31 +00:00
Daniel Sanders 892cf8af46 [mips] Fix unused variable warnings introduced in r221521
llvm-svn: 221522
2014-11-07 12:43:01 +00:00
Daniel Sanders d7eba31508 [mips] Remove remaining use of MipsCC::intArgRegs() in favour of MipsABIInfo::GetByValArgRegs() and MipsABIInfo::GetVarArgRegs()
Summary: Depends on D6112

Reviewers: theraven, vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6113

llvm-svn: 221521
2014-11-07 12:21:37 +00:00
Daniel Sanders 4f1bedaa47 [mips] Remove MipsCC::getRegVT(). NFC
Summary: It's no longer used.

Reviewers: vmedic, theraven

Reviewed By: theraven

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6112

llvm-svn: 221519
2014-11-07 12:02:59 +00:00
Daniel Sanders cfad1e3fca [mips] Remove MipsCC::analyzeCallOperands in favour of CCState::AnalyzeCallOperands. NFC
Summary:
In addition to the usual f128 workaround, it was also necessary to provide
a means of accessing ArgListEntry::IsFixed.

Reviewers: theraven, vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6111

llvm-svn: 221518
2014-11-07 11:43:49 +00:00
Daniel Sanders 41a64c407f [mips] Move SpecialCallingConv to MipsCCState and use it from tablegen-erated code. NFC
Summary:
In the long run, it should probably become a calling convention in its own
right but for now just move it out of
MipsISelLowering::analyzeCallOperands() so that we can drop this function
in favour of CCState::AnalyzeCallOperands().

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6085

llvm-svn: 221517
2014-11-07 11:10:48 +00:00
Daniel Sanders f3096a1c8d [mips] Removed IsVarArg from MipsISelLowering::analyzeCallOperands(). NFC.
Summary:
CCState objects already carry this information in their isVarArg() method.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6084

llvm-svn: 221516
2014-11-07 10:45:16 +00:00
David Majnemer 2098b86f64 SCCP: overdefined calls cannot become constant
We would attempt to fold away a call instruction which had been marked
overdefined.  However, it's not valid to transition to constant from
overdefined.

This fixes PR21512.

llvm-svn: 221513
2014-11-07 08:54:19 +00:00
Justin Hibbits 771c132e0f Add Position-independent Code model Module API.
Summary:
This makes PIC levels a Module flag attribute, which can be queried by the
backend.  The flag is named `PIC Level`, and can have a value of:

  0 - Backend-default
  1 - Small-model (-fpic)
  2 - Large-model (-fPIC)

These match the `-pic-level' command line argument for clang, and the value of the
preprocessor macro `__PIC__'.

Test Plan:
New flags tests specific for the 'PIC Level' module flag.
Tests to be added as part of a future commit for PowerPC, which will use this new API.

Reviewers: rafael, echristo

Reviewed By: rafael, echristo

Subscribers: rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D5882

llvm-svn: 221510
2014-11-07 04:46:10 +00:00
Ahmed Bougacha 72001cf287 [AArch64] Keep flags on condition vreg when instantiating a CB branch.
Reversing a CB* instruction used to drop the flags on the condition. On the
included testcase, this lead to a read from an undefined vreg.
Using addOperand keeps the flags, here <undef>.

Differential Revision: http://reviews.llvm.org/D6159

llvm-svn: 221507
2014-11-07 02:50:00 +00:00
Rafael Espindola 1d7d4eb15c Use a StringRefMemoryObject. NFC.
llvm-svn: 221503
2014-11-07 01:09:51 +00:00
David Majnemer bf93e7c7d3 LoopVectorize: Don't assume pointees are sized
A pointer's pointee might not be sized: the pointee could be a function.

Report this as IK_NoInduction when calculating isInductionVariable.

This fixes PR21508.

llvm-svn: 221501
2014-11-07 00:31:14 +00:00
David Majnemer c1eca5ad7c InstCombine: Rely on cmpxchg's return code when it's strong
Comparing the result of a cmpxchg instruction can be replaced with an
extractvalue of the cmpxchg success indicator.

llvm-svn: 221498
2014-11-06 23:23:30 +00:00
Rafael Espindola 89cb407729 Remove unused variable. NFC.
llvm-svn: 221497
2014-11-06 23:16:57 +00:00
Simon Atanasyan 60e1a79242 [ELF][yaml2obj] Handle additional MIPS specific st_other field flags
The ELF symbol `st_other` field might contain additional flags besides
visibility ones. This patch implements support for some MIPS specific
flags.

llvm-svn: 221491
2014-11-06 22:46:24 +00:00
Rafael Espindola e2541bd60e Factor out call to push_back. NFC.
llvm-svn: 221490
2014-11-06 22:39:16 +00:00
Simon Pilgrim 615ab8e721 [X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq)
Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions.

Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables.

Differential Revision: http://reviews.llvm.org/D6001

llvm-svn: 221489
2014-11-06 22:15:41 +00:00
Ahmed Bougacha b5367eeea3 [X86] Add VFMADDSUB cases for the 213->231 custom inserter.
Also add tests for vfmadd/vfmsub.

llvm-svn: 221488
2014-11-06 22:04:15 +00:00
Ahmed Bougacha 9152361d73 [X86] Add missing FMA3 VFMADDSUB in the emitter.
Also reuse the fma4 intrinsic test to cover fma3 instructions too.

llvm-svn: 221487
2014-11-06 21:58:11 +00:00
David Majnemer 504165df71 Object, COFF: Don't consider AuxFunctionDefinition for getSymbolSize
mingw lies about the size of a function's AuxFunctionDefinition.  Ignore
the field and rely on our heuristic to determine the symbol's size.

llvm-svn: 221485
2014-11-06 21:46:55 +00:00
Rafael Espindola b7a4505a3f Base check on the section name, not the variable name.
The variable is private, so the name should not be relied on. Also, the
linker uses the sections, so asan should too when trying to avoid causing
the linker problems.

llvm-svn: 221480
2014-11-06 20:01:34 +00:00
Lang Hames cdd9077f3a [RegAlloc] Kill off the trivial spiller - nobody is using it any more.
llvm-svn: 221474
2014-11-06 19:12:38 +00:00
Michael Liao 736bac6482 Indentation fixes
llvm-svn: 221472
2014-11-06 19:05:57 +00:00
Frederic Riss 051cd75b6d Try to appease MSVC buildbots after r221466.
llvm-svn: 221471
2014-11-06 19:00:47 +00:00
Frederic Riss 4aa51ae6c9 Change DIBuilder::createImportedDeclaration from taking a DIScope to a DIDescriptor.
Imported declarations can be DIGlobalVariables which aren't a DIScope. Today
clang (unknowingly I believe) shoehorns these into a DIScope and it all works
just because we never access the fields.

llvm-svn: 221466
2014-11-06 17:46:55 +00:00
Colin LeMahieu 2c769209a1 [Hexagon] Adding basic Hexagon ELF object emitter.
llvm-svn: 221465
2014-11-06 17:05:51 +00:00
Eli Bendersky 799c564236 Clean up NVPTXLowerStructArgs.cpp. NFC
* Remove unnecessary const_casts and C-style casts
* Simplify attribute access code
* Simplify ArrayRef creation
* 80-col and clang-format

llvm-svn: 221464
2014-11-06 17:05:49 +00:00
Daniel Sanders 2373af3475 [mips] Removed IsSoftFloat from MipsISelLowering::analyzeCallOperands(). NFC
Summary:
It isn't used anymore.

Depends on D6081

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6083

llvm-svn: 221463
2014-11-06 16:48:57 +00:00
Chad Rosier ac6a2f532c [Reassociate] Don't reassociate when mixing regular and fast-math FP
instructions.  Inlining might cause such cases and it's not valid to
reassociate floating-point instructions without the unsafe algebra flag.

Patch by Mehdi Amini <mehdi_amini@apple.com>!

llvm-svn: 221462
2014-11-06 16:46:37 +00:00
Daniel Sanders b70e27ca7b [mips] Removed MipsISelLowering::analyzeFormalArguments() in favour of CCState::AnalyzeFormalArguments()
Summary:
As with returns, we must be able to identify f128 arguments despite them
being lowered away. We do this with a pre-analyze step that builds a
vector and then we use this vector from the tablegen-erated code.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6081

llvm-svn: 221461
2014-11-06 16:36:30 +00:00
Rafael Espindola 26cfbea738 Compute the correct jump table entries on 32 bit windows.
On 32 bit windows we use label differences and .set does not suppress
rolocations, a combination that was not used before r220256.

This fixes PR21497.

llvm-svn: 221456
2014-11-06 14:39:49 +00:00
Andrea Di Biagio 7ecd22ca4a [X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8.
Example:
define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) {
  %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3>
  ret <4 x i32> %shuffle
}

Before llc (-mattr=+sse4.1), produced the following assembly instruction:
  pblendw $4294967103, %xmm1, %xmm0

After
  pblendw $63, %xmm1, %xmm0

llvm-svn: 221455
2014-11-06 14:36:45 +00:00
Aaron Ballman e77ffe35bf Fixing some -Wcast-qual warnings; NFC.
llvm-svn: 221454
2014-11-06 14:32:30 +00:00
Toma Tabacu 27cab751ca [mips] Tolerate the use of the %z inline asm operand modifier with non-immediates.
Summary:
Currently, we give an error if %z is used with non-immediates, instead of continuing as if the %z isn't there.

For example, you use the %z operand modifier along with the "Jr" constraints ("r" makes the operand a register, and "J" makes it an immediate, but only if its value is 0). 
In this case, you want the compiler to print "$0" if the inline asm input operand turns out to be an immediate zero and you want it to print the register containing the operand, if it's not.

We give an error in the latter case, and we shouldn't (GCC also doesn't).

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6023

llvm-svn: 221453
2014-11-06 14:25:42 +00:00
Sasa Stankovic b38db1eff8 [mips] Add the following MIPS options that control gp-relative addressing of
small data items: -mgpopt, -mlocal-sdata, -mextern-sdata. Implement gp-relative
addressing for constants.

Differential Revision: http://reviews.llvm.org/D4903

llvm-svn: 221450
2014-11-06 13:20:12 +00:00
Toma Tabacu dde4c464dd [mips] Improve error/warning messages and testing for the .cpload assembler directive.
Summary:
Improved warning message when using .cpload inside a reorder section and added an error message for using .cpload with Mips16 enabled.
Modified the tests to fit with the changes mentioned above, added a test-case for the N32 ABI in cpload.s and did some reformatting to make the tests easier to read.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5465

llvm-svn: 221447
2014-11-06 10:02:45 +00:00
Daniel Sanders 66e799ff1b [JIT] Fix more missing endian conversions (opcodes for AArch64, ARM, and Mips stub functions, and ARM target in general)
Summary:
Fixed all of the missing endian conversions that Lang Hames and I identified in
RuntimeDyldMachOARM.h.

Fixed the opcode emission in RuntimeDyldImpl::createStubFunction() for AArch64,
ARM, Mips when the host endian doesn't match the target endian.
PowerPC will need changing if it's opcodes are affected by endianness but I've
left this for now since I'm unsure if this is the case and it's the only path
that specifies the target endian.

This patch fixes MachO_ARM_PIC_relocations.s on a big-endian Mips host. This
is the last of the known issues on this host.

Reviewers: lhames

Reviewed By: lhames

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6130

llvm-svn: 221446
2014-11-06 09:53:05 +00:00
David Majnemer 51ff559500 Object, COFF: Infer symbol sizes from adjacent symbols
Use the position of the subsequent symbol in the object file to infer
the size of it's predecessor.  I hope to eventually remove whatever COFF
specific details from this little algorithm so that we can unify this
logic with what Mach-O does.

llvm-svn: 221444
2014-11-06 08:10:41 +00:00
David Majnemer 03d2c51cf2 X86, MC: Tidy up some whitespace in GetRelocType
No functionality change intended.

llvm-svn: 221443
2014-11-06 08:10:37 +00:00
Justin Bogner 58e41344f9 GCOV: Make sure that function idents in the .gcda and .gcno match
When generating gcov compatible profiling, we sometimes skip emitting
data for functions for one reason or another. However, this was
emitting different function IDs in the .gcno and .gcda files, because
the .gcno case was using the loop index before skipping functions and
the .gcda the array index after. This resulted in completely invalid
gcov data.

This fixes the problem by making the .gcno loop track the ID
separately from the loop index.

llvm-svn: 221441
2014-11-06 06:55:02 +00:00
Rafael Espindola 83f0ea8982 Add three other sections when L symbols are allowed.
llvm-svn: 221436
2014-11-06 05:01:21 +00:00
Rafael Espindola bf77ed6826 Allow L symbols in no_dead_strip sections.
If a section cannot be dead stripped, it is safe to use L symbols, since
the linker will keep all of it in the end.

llvm-svn: 221431
2014-11-06 02:42:03 +00:00
Quentin Colombet dbe33e7aa4 [X86] Lower VSELECT into SHRUNKBLEND when we shrink the bits used into the
condition to match a blend.
This prevents optimizations that work on VSELECT to perform invalid
transformations. Indeed, the optimized condition does not match the vector
boolean content that is expected and bad things may happen.

This patch yields the exact same code on the whole test-suite + specs (-O3 and
-O3 -march=core-avx2), it improves one test case (vector-blend.ll) and fixes a
bug reduced in vselect-avx.ll.

<rdar://problem/18819506>

llvm-svn: 221429
2014-11-06 02:25:03 +00:00
Matt Arsenault 6e863d12e5 Remove unnecessary .c_str() when implicitly converting to Twine
llvm-svn: 221422
2014-11-06 01:13:27 +00:00
Petar Jovanovic 35f05747f3 [mips64] Fix MIPS64 exception personality encoding
Remove dynamic relocations of __gxx_personality_v0 from the .eh_frame.
The MIPS64 follow-up of the MIPS32 fix (rL209907).

Patch by Vladimir Stefanovic.

Differential Revision: http://reviews.llvm.org/D6141

llvm-svn: 221408
2014-11-05 22:42:31 +00:00
Simon Pilgrim 1fc483d991 [X86][SSE] Vector integer to float conversion memory folding
Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works.

Differential Revision: http://reviews.llvm.org/D5981

llvm-svn: 221407
2014-11-05 22:28:25 +00:00
Michael Ilseman a7202bdbed Fix heap-use-after-free bug in expandSDiv when the operands are
constants, as discovered by ASAN.

Patch by Mehdi Amini!

llvm-svn: 221401
2014-11-05 21:28:24 +00:00
Steven Wu d994b8aaa4 Remove obsolete ARM intrinsics vclz and vcnt
Both of the intrinsics get autoupgraded to target independent
intrinsics.

llvm-svn: 221396
2014-11-05 21:02:55 +00:00
Matt Arsenault f2676a5afc R600/SI: Fix omod display for VOP3b
llvm-svn: 221387
2014-11-05 19:35:00 +00:00
Derek Schuff a54222045e [x86 fast-isel] Materialize allocas with the correct-sized lea for ILP32
Summary:
X86FastISel::fastMaterializeAlloca was incorrectly conditioning its
opcode selection on subtarget bitness rather than pointer size.

Differential Revision: http://reviews.llvm.org/D6136

llvm-svn: 221386
2014-11-05 19:27:21 +00:00
Matt Arsenault f3cd4512ac R600/SI: Move all rsrc building functions to SIISelLowering
llvm-svn: 221383
2014-11-05 19:01:19 +00:00
Matt Arsenault 485defe58c R600/SI: Remove SI_ADDR64_RSRC
llvm-svn: 221382
2014-11-05 19:01:17 +00:00
Justin Holewinski 3d140fcfd1 [NVPTX] Add NVPTXLowerStructArgs pass
This works around the limitation that PTX does not allow .param space
loads/stores with arbitrary pointers.

If a function has a by-val struct ptr arg, say foo(%struct.x *byval %d), then
add the following instructions to the first basic block :

%temp = alloca %struct.x, align 8
%tt1 = bitcast %struct.x * %d to i8 *
%tt2 = llvm.nvvm.cvt.gen.to.param %tt2
%tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) *
%tv = load %struct.x addrspace(101) * %tempd
store %struct.x %tv, %struct.x * %temp, align 8

The above code allocates some space in the stack and copies the incoming
struct from param space to local space. Then replace all occurences of %d
by %temp.

Fixes PR21465.

llvm-svn: 221377
2014-11-05 18:19:30 +00:00
Duncan P. N. Exon Smith c5754a65e6 IR: MDNode => Value: NamedMDNode::getOperator()
Change `NamedMDNode::getOperator()` from returning `MDNode *` to
returning `Value *`.  To reduce boilerplate at some call sites, add a
`getOperatorAsMDNode()` for named metadata that's expected to only
return `MDNode` -- for now, that's everything, but debug node named
metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change.  This
is part of PR21433.

Note that there's a follow-up patch to clang for the API change.

llvm-svn: 221375
2014-11-05 18:16:03 +00:00
Sanjay Patel 8f093f4138 remove extra breaks; NFC
llvm-svn: 221374
2014-11-05 18:00:07 +00:00
Duncan P. N. Exon Smith 21efe02e59 IR: MDNode => Value: AsmWriter SlotTracker API
Change `SlotTracker::CreateMetadataSlot()` and
`SlotTracker::getMetadataSlot()` to use `Value` instead of `MDNode`.
Part of PR21433.

llvm-svn: 221373
2014-11-05 17:56:28 +00:00
Tilmann Scheller 30c5ca25a5 [ARM] Remove more dead code.
Dead code identified by the Clang static analyzer.

llvm-svn: 221372
2014-11-05 17:45:04 +00:00
Zoran Jovanovic 06c9d55123 ps][microMIPS] Implement CodeGen support for ANDI16 instruction
llvm-svn: 221371
2014-11-05 17:43:00 +00:00
Colin LeMahieu 816ef086f6 [Hexagon] [NFC] Alphabetizing cmake files.
llvm-svn: 221370
2014-11-05 17:38:48 +00:00
Zoran Jovanovic 9f99723d92 ps][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions
llvm-svn: 221369
2014-11-05 17:38:31 +00:00
Tilmann Scheller c339992338 [ARM] Remove another redundant assignment.
Found by the Clang static analyzer.

llvm-svn: 221368
2014-11-05 17:34:04 +00:00
Zoran Jovanovic 8853171b46 [mips][microMIPS] Implement ANDI16 instruction
llvm-svn: 221367
2014-11-05 17:31:00 +00:00
Tilmann Scheller 219ad28076 [ARM] Remove redundant assignment.
Found by the Clang static analyzer.

llvm-svn: 221366
2014-11-05 17:28:19 +00:00
Peter Collingbourne a1099840ff [dfsan] Abort at runtime on indirect calls to uninstrumented vararg functions.
We currently have no infrastructure to support these correctly.

This is accomplished by generating a call to a runtime library function that
aborts at runtime in place of the regular wrapper for such functions. Direct
calls are rewritten in the usual way during traversal of the caller's IR.

We also remove the "split-stack" attribute from such wrappers, as the code
generator cannot currently handle split-stack vararg functions.

llvm-svn: 221360
2014-11-05 17:21:00 +00:00
Duncan P. N. Exon Smith 9727e7865e IR: MDNode => Value: NamedMDNode::addOperand()
Change `NamedMDNode::addOperand()` to take a `Value *` instead of an
`MDNode *`.  This is part of PR21433.

llvm-svn: 221359
2014-11-05 17:16:09 +00:00
Tilmann Scheller f2572c5097 [ARM] Remove dead code identified by the Clang static analyzer.
llvm-svn: 221358
2014-11-05 17:10:43 +00:00
Zoran Jovanovic 9c654830f7 [mips][microMIPS] Mark symbols as microMIPS if necessary
Differential Revision: http://reviews.llvm.org/D6039

llvm-svn: 221355
2014-11-05 16:35:20 +00:00
Zoran Jovanovic a87308c84c Reverted revisions 221351, 221352 and 221353.
llvm-svn: 221354
2014-11-05 16:19:59 +00:00
Zoran Jovanovic 3038500f3b [mips][microMIPS] Implement CodeGen support for ANDI16 instruction
Differential Revision: http://reviews.llvm.org/D5797

llvm-svn: 221353
2014-11-05 15:54:05 +00:00
Zoran Jovanovic f4f5f1e272 [mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions
Differential Revision: http://reviews.llvm.org/D5933

llvm-svn: 221352
2014-11-05 15:46:53 +00:00
Zoran Jovanovic e548bb0634 [mips][microMIPS] Implement ANDI16 instruction
Differential Revision: http://reviews.llvm.org/D5163

llvm-svn: 221351
2014-11-05 15:39:41 +00:00
Tom Stellard 326d6ece94 R600/SI: Change all instruction assembly names to lowercase.
This matches the format produced by the AMD proprietary driver.

//==================================================================//
// Shell script for converting .ll test cases: (Pass the .ll files
   you want to convert to this script as arguments).
//==================================================================//

; This was necessary on my system so that A-Z in sed would match only
; upper case.  I'm not sure why.
export LC_ALL='C'

TEST_FILES="$*"

MATCHES=`grep -v Patterns SIInstructions.td | grep -o '"[A-Z0-9_]\+["e]' | grep -o '[A-Z0-9_]\+' | sort -r`

for f in $TEST_FILES; do
  # Check that there are SI tests:
  grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f
  if [ $? -eq 0 ]; then
    for match in $MATCHES; do
      sed -i -e "s/\([ :]$match\)/\L\1/" $f
    done

    # Try to get check lines with partial instruction names
    sed -i 's/\(;[ ]*SI[A-Z\\-]*: \)\([A-Z_0-9]\+\)/\1\L\2/' $f
  fi
done

sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll
sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g'../../../test/CodeGen/R600/llvm.AMDGPU.bfe.*32.ll ../../../test/CodeGen/R600/sext-in-reg.ll
sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll
sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll
sed -i 's/\(; CHECK[-NOT]*: \)\([A-Z_0-9]\+\)/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll

//==================================================================//
// Shell script for converting .td files (run this last)
//==================================================================//

export LC_ALL='C'
sed -i -e '/Patterns/!s/\("[A-Z0-9_]\+[ "e]\)/\L\1/g' SIInstructions.td
sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td

llvm-svn: 221350
2014-11-05 14:50:53 +00:00
Andrea Di Biagio ce46b97b48 [X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks.
This patch improves the folding of vector AND nodes into blend operations for
targets that feature SSE4.1. A vector AND node where one of the operands is
a constant build_vector with elements that are either zero or all-ones can be
converted into a blend.

This allows for example to simplify the following code:

define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
  %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1>
  %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0>
  %3 = or <4 x i32> %1, %2
  ret <4 x i32> %3
}

Before this patch llc (-mcpu=corei7) generated:
        andps  LCPI1_0(%rip), %xmm0, %xmm0
        andps  LCPI1_1(%rip), %xmm1, %xmm1
        orps   %xmm1, %xmm0, %xmm0
        retq

With this patch we generate a single 'vpblendw'.

llvm-svn: 221343
2014-11-05 13:04:14 +00:00
Oliver Stannard 9e89d8cc5c [ARM] Honor FeatureD16 in the assembler and disassembler
Some ARM FPUs only have 16 double-precision registers, rather than the
normal 32. LLVM represents this with the D16 target feature. This is
currently used by CodeGen to avoid using high registers when they are
not available, but the assembler and disassembler do not.

I fix this in the assmebler and disassembler rather than the
InstrInfo.td files, as the latter would require a large number of
changes everywhere one of the floating-point instructions is referenced
in the backend. This solution is similar to the one used for
co-processor numbers and MSR masks.

llvm-svn: 221341
2014-11-05 12:06:39 +00:00
Craig Topper 12f0d9ef2c Improve logic that decides if its profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to physical register. Fix up the tests that this change improves.
llvm-svn: 221336
2014-11-05 06:43:02 +00:00
David Majnemer 5026722287 llvm-readobj: Add support for dumping the DOS header in PE files
llvm-svn: 221333
2014-11-05 06:24:35 +00:00
Jiangning Liu 1fb71bc395 Revert 220932.
Commit 220932 caused crash when building clang-tblgen on aarch64 debian target,
so it's blocking all daily tests.

The std::call_once implementation in pthread has bug for aarch64 debian.

llvm-svn: 221331
2014-11-05 04:44:31 +00:00
Duncan P. N. Exon Smith b28deb1967 IR: Metadata: Remove unnecessary dyn_cast
llvm-svn: 221328
2014-11-05 01:55:06 +00:00
David Majnemer bf7550e7ec InstSimplify: Exact shifts of X by Y are X if X has the lsb set
Exact shifts may not shift out any non-zero bits. Use computeKnownBits
to determine when this occurs and just return the left hand side.

This fixes PR21477.

llvm-svn: 221325
2014-11-05 00:59:59 +00:00
Tim Northover dc0d9e46a5 ARM: try to add extra CS-register whenever stack alignment >= 8.
We currently try to push an even number of registers to preserve 8-byte
alignment during a function's prologue, but only when the stack alignment is
prcisely 8. Many of the reasons for doing it apply also when that alignment > 8
(the extra store is often free, and can save another stack adjustment, though
less frequently for 16-byte stack alignment).

llvm-svn: 221321
2014-11-05 00:27:20 +00:00
Tim Northover 228c943f31 ARM/Dwarf: correctly align stack before callee-saved VPRs
We were making an attempt to do this by adding an extra callee-saved GPR (so
that there was an even number in the list), but when that failed we went ahead
and pushed anyway.

This had a couple of potential issues:
  + The .cfi directives we emit misplaced dN because they were based on
    PrologEpilogInserter's calculation.
  + Unaligned stores can be less efficient.
  + Unaligned stores can actually fault (likely only an issue in niche cases,
    but possible).

This adds a final explicit stack adjustment if all other options fail, so that
the actual locations of the registers match up with where they should be.

llvm-svn: 221320
2014-11-05 00:27:13 +00:00
David Majnemer f20d7c4c61 Analysis: Make isSafeToSpeculativelyExecute fire less for divides
Divides and remainder operations do not behave like other operations
when they are given poison: they turn into undefined behavior.

It's really hard to know if the operands going into a div are or are not
poison.  Because of this, we should only choose to speculate if there
are constant operands which we can easily reason about.

This fixes PR21412.

llvm-svn: 221318
2014-11-04 23:49:08 +00:00
Reid Kleckner 941e93e9a8 Revert "[Reassociate] Canonicalize negative constants out of expressions."
This reverts commit r221171.

It performs this invalid transformation:
-  %div.i = urem i64 -1, %add
-  %sub.i = sub i64 -2, %div.i
+  %div.i = urem i64 1, %add
+  %sub.i1 = add i64 %div.i, -2

llvm-svn: 221317
2014-11-04 23:42:45 +00:00
Simon Pilgrim c9a0779309 [X86][SSE] Enable commutation for SSE immediate blend instructions
Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask.

This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests).

Differential Revision: http://reviews.llvm.org/D6015

llvm-svn: 221313
2014-11-04 23:25:08 +00:00
Mark Heffernan 2d393ea6ef Revert earlier change removing setPreservesCFG from instcombine (r221223) and
change LoopSimplifyPass to be !isCFGOnly.  The motivation for the earlier patch
(r221223) was that LoopSimplify is not preserved by instcombine though
setPreservesCFG indicates that it is.  This change fixes the issue
by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less
invasive.

llvm-svn: 221311
2014-11-04 23:02:09 +00:00
Juergen Ributzka f9660f0712 [AArch64] Use the correct register class for ORR.
While fixing up the register classes in the machine combiner in a previous
commit I missed one.

This fixes the last one and adds a test case.

llvm-svn: 221308
2014-11-04 22:20:07 +00:00
Rafael Espindola d85260827c Revert "[mips] Add names and tests for the hardware registers"
This reverts commit r221299.

The tests

    LLVM :: MC/Disassembler/Mips/mips32.txt
    LLVM :: MC/Disassembler/Mips/mips32_le.txt

were failing.

llvm-svn: 221307
2014-11-04 22:15:05 +00:00
David Blaikie 3a443c29b9 Provide gmlt-like inline scope information in the skeleton CU to facilitate symbolication without needing the .dwo files
Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2,
binary increases by 25%.

A large binary inside Google, split-dwarf, -O0, and other internal flags
(GDB index, etc) increases by 1.8%, optimized build is 35%.

The size impact may be somewhat greater in .o files (I haven't measured
that much - since the linked executable -O0 numbers seemed low enough)
due to relocations. These relocations could be removed if we taught the
llvm-symbolizer to handle indexed addressing in the .o file (GDB can't
cope with this just yet, but GDB won't be reading this info anyway).
Also debug_ranges could be shared between .o and .dwo, though ideally
debug_ranges would get a schema that could used index(+offset)
addressing, and move to the .dwo file, then we'd be back to sharing
addresses in the address pool again.

But for now, these sizes seem small enough to go ahead with this.

Verified that no other DW_TAGs are produced into the .o file other than
subprograms and inlined_subroutines.

llvm-svn: 221306
2014-11-04 22:12:25 +00:00
David Blaikie 9bfd7a9f43 Move cross-unit DIE caching to the DwarfFile level, so it doesn't interfere with fission-gmlt data and produce skeleton<>full unit cross referencing.
llvm-svn: 221305
2014-11-04 22:12:18 +00:00
Rafael Espindola bfd0f01dd7 Don't produce relocations for a difference in a section with no symbols.
We were producing a relocation for
----------------
.section foo,bar
La:
Lb:
 .long   La-Lb
--------------

but not for

---------------------
  .section foo,bar
zed:
La:
Lb:
 .long   La-Lb
----------------

This patch handles the case where both fragments are part of the first atom
in a section and there is no corresponding symbol to that atom.

This fixes pr21328.

llvm-svn: 221304
2014-11-04 22:10:33 +00:00
Vasileios Kalintiris a16974a5c0 [mips] Move COP2 & COP3 load/store instructions from MipsInstrFPU.td to MipsInstrInfo.td. NFC.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5843

llvm-svn: 221300
2014-11-04 21:45:16 +00:00
Vasileios Kalintiris df6e0d0371 [mips] Add names and tests for the hardware registers
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5763

llvm-svn: 221299
2014-11-04 21:30:44 +00:00
Andrea Di Biagio f5b34e535d [X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicit set FeatureAVX and FeatureSSE4A for all the bdver* cpus.
This patch adds 'FeatureSlowSHLD' to 'bdver3'.
According to the official AMD optimization guide for amdfam15: "Using
alternative code in place of SHLD achieves lower overall latency and
requires fewer execution resources. The 32-bit and 64-bit forms of
ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
instructions, while SHLD is a VectorPath instruction."

This patch also explicitly sets feature AVX and SSE4A for all the bdver*
cpus. This part of the patch is a non-functional change and it is mainly
done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A).

llvm-svn: 221296
2014-11-04 21:18:09 +00:00
Arnaud A. de Grandmaison a11cab3120 [PBQP] Callee saved regs should have a higher cost than scratch regs
Registers are not all equal. Some are not allocatable (infinite cost),
some have to be preserved but can be used, and some others are just free
to use.

Ensure there is a cost hierarchy reflecting this fact, so that the
allocator will favor scratch registers over callee-saved registers.

llvm-svn: 221293
2014-11-04 20:51:29 +00:00
Arnaud A. de Grandmaison 829dd81377 [PBQP] Tweak spill costs and coalescing benefits
This patch improves how the different costs (register, interference, spill
and coalescing) relates together. The assumption is now that:
 - coalescing (or any other "side effect" of reg alloc) is negative, and
   instead of being derived from a spill cost, they use the block
   frequency info.
 - spill costs are in the [MinSpillCost:+inf( range
 - register or interference costs are in [0.0:MinSpillCost( or +inf

The current MinSpillCost is set to 10.0, which is a random value high
enough that the current constraint builders do not need to worry about
when settings costs. It would however be worth adding a normalization
step for register and interference costs as the last step in the
constraint builder chain to ensure they are not greater than SpillMinCost
(unless this has some sense for some architectures). This would work well
with the current builder pipeline, where all costs are tweaked relatively
to each others, but could grow above MinSpillCost if the pipeline is
deep enough.

The current heuristic is tuned to depend rather on the number of uses of
a live interval rather than a density of uses, as used by the greedy
allocator. This heuristic provides a few percent improvement on a number
of benchmarks (eembc, spec, ...) and will definitely need to change once
spill placement is implemented: the current spill placement is really
ineficient, so making the cost proportionnal to the number of use is a
clear win.

llvm-svn: 221292
2014-11-04 20:51:24 +00:00
Matt Arsenault a95f5a0ec1 R600/SI: Rename div_scale dest operands to match documentation
llvm-svn: 221291
2014-11-04 20:29:20 +00:00
Benjamin Kramer 185dc0da1f AArch64: Pattern match integer vector abs like we do on ARM.
This kind of pattern is emitted by the loop vectorizer.

llvm-svn: 221289
2014-11-04 20:10:06 +00:00
Kostya Serebryany c5bd9810cc [asan] [mips] changed ShadowOffset32 for systems having 16kb PageSize; patch by Kumar Sukhani
llvm-svn: 221288
2014-11-04 19:46:15 +00:00
David Majnemer 2de97fcd9a InstSimplify: Fold a hasNoSignedWrap() call into a match() expression
No functionality change intended, it's just a little more concise.

llvm-svn: 221281
2014-11-04 17:47:13 +00:00
David Majnemer 4f438377fb InstSimplify: Fold a hasNoUnsignedWrap() call into a match() expression
No functionality change intended, it's just a little more concise.

llvm-svn: 221280
2014-11-04 17:38:50 +00:00
Toma Tabacu cc2502d8f3 [mips] Improve support for the .set mips16/nomips16 assembler directives.
Summary:
Appropriately set/clear the FeatureBit for Mips16 when these assembler directives are used and also emit ".set nomips16" (previously, only ".set mips16" was being emitted).

These improvements allow for better testing of the .cpload/.cprestore assembler directives (which are not supposed to work when Mips16 is enabled).

Test Plan: The test is bare-bones because there are no MC tests for Mips16 instructions (there's only one, which checks that the Mips16 ELF header flag gets set), and that suggests to me that it has not been implemented yet in the IAS.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5462

llvm-svn: 221277
2014-11-04 17:18:07 +00:00
Sanjay Patel aee8421088 remove function names from comments; NFC
llvm-svn: 221274
2014-11-04 16:27:42 +00:00
Sanjay Patel 547e9752ff fix typo in comment
llvm-svn: 221273
2014-11-04 16:09:50 +00:00
Simon Atanasyan d2bfd00e71 [yaml2obj] Allow yaml2obj tool to recognize EF_MIPS_NAN2008 flag
llvm-svn: 221268
2014-11-04 13:33:36 +00:00
Rafael Espindola c1f30877e0 Remove FindProgramByName. NFC.
llvm-svn: 221258
2014-11-04 12:35:47 +00:00
Yaron Keren ec69a4ece1 Fix Visual C++ warning, Program.inc(85): warning C4018: '<' : signed/unsigned mismatch.
llvm-svn: 221252
2014-11-04 09:22:41 +00:00
NAKAMURA Takumi 72e626e305 sys::findProgramByName(): [Win32] Tweak to pass lowercase .exe to SearchPath() to appease clang Driver's tests.
It seems SearchPath() doesn't show actual extension on the filesystem.

FIXME: Shall we use FindFirstFile() here?
llvm-svn: 221246
2014-11-04 08:17:15 +00:00
David Majnemer b925715c56 CodeGen: Enable DWARF emission for MS ABI targets
This is experimental, just barely enough to get things to not
immediately combust.

A note for those who are curious:
Only lld can successfully link the object files, other linkers truncate
the section names making the debug sections illegible to debuggers.

Even with this in mind, we believe we are having trouble with SECREL
relocations.

llvm-svn: 221245
2014-11-04 08:03:31 +00:00
Yaron Keren 6091fe7db9 #include <winbase.h> is not enough for Visual C++ 2013, it errors:
1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46):
error C2146: syntax error : missing ';' before identifier 'nLength'
1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46):
error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
...

including <windows.h> is actually required.

llvm-svn: 221244
2014-11-04 07:53:30 +00:00
NAKAMURA Takumi 8d8f396d86 R600/LLVMBuild.txt: Add TransformUtils.
llvm-svn: 221228
2014-11-04 02:16:53 +00:00
Reid Kleckner dd3f3edafa Revert "Transforms: reapply SVN r219899"
This reverts commit r220811 and r220839. It made an incorrect change to
musttail handling.

llvm-svn: 221226
2014-11-04 02:02:14 +00:00
Mark Heffernan 2e25042a93 Remove setPreservesCFG from instcombine. The pass, in particular, does not
preserve LoopSimplify because instcombine may replace branch predicates
with undef which loop simplify then replaces with always exit.  Replace
setPreservesCFG with the more constrained preservation of DomTree and
LoopInfo.

llvm-svn: 221223
2014-11-04 01:51:01 +00:00
Michael J. Spencer f9074b5a91 Use findProgramByName.
llvm-svn: 221221
2014-11-04 01:29:59 +00:00
Michael J. Spencer 65ffd92f07 [Support][Program] Add findProgramByName(Name, OptionalPaths)
llvm-svn: 221220
2014-11-04 01:29:29 +00:00
Kevin Enderby 72cdbf47a9 Remove the static version of getScatteredRelocationType() now that r221211 added
a public version MachOObjectFile::getScatteredRelocationType().

This should fix the build bot for the unused function error.

llvm-svn: 221216
2014-11-04 01:12:39 +00:00
Sanjoy Das e839965faa The patchpoint lowering logic would crash with live constants equal to
the tombstone or empty keys of a DenseMap<int64_t, T>.  This patch
fixes the issue (and adds a tests case).

llvm-svn: 221214
2014-11-04 00:59:21 +00:00
Kevin Enderby 9907d0a3c2 Add the code and test cases for 32-bit Intel to llvm-objdump’s Mach-O symbolizer.
llvm-svn: 221211
2014-11-04 00:43:16 +00:00
Colin LeMahieu 5241881bbc [Hexagon] Reverting 220584 to address ASAN errors.
llvm-svn: 221210
2014-11-04 00:14:36 +00:00
Sanjoy Das 429c9cae09 Change logic in StackMaps::recordStackMapOpers to use the isInt<32>
predicate instead of bitwise operations.

This is not a functional change.

llvm-svn: 221209
2014-11-04 00:06:57 +00:00
Akira Hatanaka bc950d52d7 Rename variables to conform to llvm coding standards.
Differential Revision: http://reviews.llvm.org/D6062

llvm-svn: 221204
2014-11-03 23:24:10 +00:00
Hal Finkel 840257a49c Use AA in LoadCombine
LoadCombine can be smarter about aborting when a writing instruction is
encountered, instead of aborting upon encountering any writing instruction, use
an AliasSetTracker, and only abort when encountering some write that might
alias with the loads that could potentially be combined.

This was originally motivated by comments made (and a test case provided) by
David Majnemer in response to PR21448. It turned out that LoadCombine was not
responsible for that PR, but LoadCombine should also be improved so that
unrelated stores (and @llvm.assume) don't interrupt load combining.

llvm-svn: 221203
2014-11-03 23:19:16 +00:00
David Blaikie 5b02a19f90 Use common range handling for the CU's ranges
This generalizes the range handling for ranges in both the skeleton and
full unit, laying the foundation for the addition of more ranges (rather
than just the CU's special case) in the skeleton CU with fission+gmlt.

llvm-svn: 221202
2014-11-03 23:10:59 +00:00
Akira Hatanaka 9ee2c26b49 [AArch64] Make function processLogicalImmediate more efficient. NFC.
llvm-svn: 221199
2014-11-03 23:06:31 +00:00
David Majnemer 7e2b9882b1 InstCombine: Remove infinite loop caused by FoldOpIntoPhi
FoldOpIntoPhi could create an infinite loop if the PHI could potentially
reach a BB it was considering inserting instructions into.  The
instructions it would insert would eventually lead to other combines
firing which would, again, lead to FoldOpIntoPhi firing.

The solution is to handicap FoldOpIntoPhi so that it doesn't attempt to
insert instructions that the PHI might reach.

This fixes PR21377.

llvm-svn: 221187
2014-11-03 21:55:12 +00:00
David Blaikie 542616d47c Push the CURangeList down into the skeleton CU (where available) rather than the full CU
So that it may be shared between skeleton/full compile unit, for CU
ranges and other ranges to be added for fission+gmlt.

(at some point we might want some kind of object shared between the
skeleton and full compile units for all those things we only want one of
in that scope, rather than having the full unit always look through to
the skeleton... - alternatively, we might be able to have the skeleton
pointer (or another, separate pointer) point to the skeleton or to the
unit itself in non-fission, so we don't have to special case its
absence)

llvm-svn: 221186
2014-11-03 21:52:56 +00:00
Ahmed Bougacha ec0f3d755f [X86] Add debug print name for X86ISD::[US]MUL8. NFC-ish.
The opcodes were added in r220516, but I forgot to add the print names.

llvm-svn: 221185
2014-11-03 21:25:18 +00:00
David Blaikie ce343492ee Add DwarfCompileUnit::BaseAddress to track the base address used by relative addressing in debug_ranges and debug_loc
This is one of a few steps to generalize range handling to include the
CU range (thus the CU's range list will be moved into the range list
list, losing track of the base address in the process), which means
generalizing ranges from both the skeleton and full unit under fission.

And... then I can used that generalized support for ranges in
fission+gmlt where there'll be a bunch more ranges in the skeleton.

llvm-svn: 221182
2014-11-03 21:15:30 +00:00
Akira Hatanaka b961534818 [ARM, inline-asm] Fix ARMTargetLowering::getRegForInlineAsmConstraint to return
register class tGPRRegClass if the target is thumb1.

This commit fixes a crash that occurs during register allocation which was
triggered when a virtual register defined by an inline-asm instruction had to
be spilled.
 
rdar://problem/18740489

llvm-svn: 221178
2014-11-03 20:37:04 +00:00
Ahmed Bougacha 12eb558bd9 [X86] 8bit divrem: Improve codegen for AH register extraction.
For 8-bit divrems where the remainder is used, we used to generate:
    divb  %sil
    shrw  $8, %ax
    movzbl  %al, %eax

That was to avoid an H-reg access, which is problematic mainly because
it isn't possible in REX-prefixed instructions.

This patch optimizes that to:
    divb  %sil
    movzbl  %ah, %eax

To do that, we explicitly extend AH, and extract the L-subreg in the
resulting register.  The extension is done using the NOREX variants of
MOVZX.  To support signed operations, MOVSX_NOREX is also added.
Further, this introduces a new SDNode type, [us]divrem_ext_hreg, which is
then lowered to a sequence containing a single zext (rather than 2).

Differential Revision: http://reviews.llvm.org/D6064

llvm-svn: 221176
2014-11-03 20:26:35 +00:00
Hal Finkel 1e16fa302e EarlyCSE should ignore calls to @llvm.assume
EarlyCSE uses a simple generation scheme for handling memory-based
dependencies, and calls to @llvm.assume (which are marked as writing to memory
to ensure the preservation of control dependencies) disturb that scheme
unnecessarily. Skipping calls to @llvm.assume is legal, and the alternative
(adding AA calls in EarlyCSE) is likely undesirable (we have GVN for that).

Fixes PR21448.

llvm-svn: 221175
2014-11-03 20:21:32 +00:00
Tom Stellard 5cbb53c41e Reapply: R600: Make sure to inline all internal functions
Function calls aren't supported yet.

This was reverted due to build breakages, which should be fixed now.

llvm-svn: 221173
2014-11-03 19:49:05 +00:00
Chad Rosier 005505b027 [Reassociate] Canonicalize negative constants out of expressions.
This gives CSE/GVN more options to eliminate duplicate expressions.
This is a follow up patch to http://reviews.llvm.org/D4904.

http://reviews.llvm.org/D5363

llvm-svn: 221171
2014-11-03 19:11:30 +00:00
Paul Robinson ad06e430ce Normally an 'optnone' function goes through fast-isel, which does not
call DAGCombiner. But we ran into a case (on Windows) where the
calling convention causes argument lowering to bail out of fast-isel,
and we end up in CodeGenAndEmitDAG() which does run DAGCombiner.
So, we need to make DAGCombiner check for 'optnone' after all.

Commit includes the test that found this, plus another one that got
missed in the original optnone work.

llvm-svn: 221168
2014-11-03 18:19:26 +00:00
Duncan P. N. Exon Smith 3d5a02f677 IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()
Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of
`MDNode` to one of `Value`.  Part of PR21433.

llvm-svn: 221167
2014-11-03 18:13:57 +00:00
Charlie Turner 1d8cc909cc Remove the cortex-a9-mp CPU.
This CPU definition is redundant. The Cortex-A9 is defined as
supporting multiprocessing extensions. Remove its definition and
update appropriate tests.

LLVM defines both a cortex-a9 CPU and a cortex-a9-mp CPU. The only
difference between the two CPU definitions in ARM.td is that
cortex-a9-mp contains the feature FeatureMP for multiprocessing
extensions.

This is redundant since the Cortex-A9 is defined as having
multiprocessing extensions in the TRMs. armcc also defines the
Cortex-A9 as having multiprocessing extensions by default.

Change-Id: Ifcadaa6c322be0a33d9d2a39cfdd7da1d75981a7
llvm-svn: 221166
2014-11-03 17:38:00 +00:00
David Blaikie 077ad48447 Cleanup some unused or trivial functions in DwarfCompileUnit
llvm-svn: 221164
2014-11-03 17:10:38 +00:00
David Blaikie bc532b44a0 Sink DwarfUnit::CURanges into DwarfCompileUnit
llvm-svn: 221161
2014-11-03 16:40:43 +00:00
Oliver Stannard 269a275cb4 [AArch64] Fix miscompile of comparison with 0xffffffffffffffff
Some literals in the AArch64 backend had 15 'f's rather than 16, causing
comparisons with a constant 0xffffffffffffffff to be miscompiled.

llvm-svn: 221157
2014-11-03 15:28:40 +00:00
Sid Manning 326f8af463 Handle ctor/init_array initialization.
Hexagon was not calling InitializeELF and could not select between
ctors and init_array.

Phabricator revision: http://reviews.llvm.org/D6061

llvm-svn: 221156
2014-11-03 14:56:05 +00:00
Rafael Espindola 42bce8f69d Add CRLF support to LineIterator.
The MRI scripts have to work with CRLF, and in general it is probably
a good idea to support this in a core utility like LineIterator.

llvm-svn: 221153
2014-11-03 14:09:47 +00:00
Oliver Stannard cf6bfb1dd0 Revert r221150, as it broke sanitizer tests
llvm-svn: 221151
2014-11-03 12:19:03 +00:00
Oliver Stannard 652ec6ee89 Emit .eh_frame with relocations to functions, rather than sections
When LLVM emits DWARF call frame information, it currently creates a local,
section-relative symbol in the code section, which is pointed to by a
relocation on the .eh_frame section. However, for C++ we emit some functions in
section groups, and the SysV ABI has some rules to make it easier to remove
these sections
(http://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_group_rules):

  A symbol table entry with STB_LOCAL binding that is defined relative to one
  of a group's sections, and that is contained in a symbol table section that is
  not part of the group, must be discarded if the group members are discarded.
  References to this symbol table entry from outside the group are not allowed.

This means that we need to use the function symbol for the relocation, not a
temporary symbol.

There was a comment in the code claiming that the local symbol was used to
avoid creating a relocation, but a relocation must be created anyway as the
code and CFI are in different sections.

llvm-svn: 221150
2014-11-03 12:02:51 +00:00
Peter Collingbourne 094d061659 CMake: Add libm to list of system libs printed by llvm-config.
This is required by the interpreter library, and also matches the autoconf
behavior.

llvm-svn: 221147
2014-11-03 10:38:26 +00:00
Daniel Sanders 0ad1719d15 [mips] Remove unused prototype and variable. NFC.
llvm-svn: 221146
2014-11-03 10:14:57 +00:00
David Majnemer 72a643dc8f InstCombine: Combine (X | Y) - X to (~X & Y)
This implements the transformation from (X | Y) - X to (~X & Y).

Differential Revision: http://reviews.llvm.org/D5791

llvm-svn: 221129
2014-11-03 05:53:55 +00:00
David Blaikie 89a26f012a Sink range list handling down from DwarfUnit into its only use, in DwarfCompileUnit.
llvm-svn: 221123
2014-11-03 02:41:49 +00:00
Diego Novillo fcd556074c Use ErrorOr for the ::create factory on instrumented and sample profilers.
Summary:
As discussed in
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141027/242445.html,
the creation of reader and writer instances is better done using
ErrorOr. There are no functional changes, but several callers needed to
be adjusted.

Reviewers: bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6076

llvm-svn: 221120
2014-11-03 00:51:45 +00:00
Matt Arsenault 4cd1d4ecb1 R600: Don't unnecessarily repeat the register class
llvm-svn: 221119
2014-11-02 23:46:59 +00:00
Matt Arsenault 7d858d87cd R600/SI: Use REG_SEQUENCE instead of INSERT_SUBREGs
llvm-svn: 221118
2014-11-02 23:46:54 +00:00
Matt Arsenault eb49216bba Support REG_SEQUENCE in tablegen.
The problem is mostly that variadic output instruction
aren't handled, so it is rejected for having an inconsistent
number of operands, and then the right number of operands
isn't emitted.

llvm-svn: 221117
2014-11-02 23:46:51 +00:00
Daniel Sanders 23e987766b Re-commit r221056 and others with fix, "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
sret arguments can never originate from an f128 argument so we detect
sret arguments and push false into OriginalArgWasF128.

llvm-svn: 221102
2014-11-02 16:09:29 +00:00
Rafael Espindola 778fcc770b Revert r221096 bringing back r221014 with a fix.
The issue was that linkAppendingVarProto does the full linking job, including
deleting the old dst variable. The fix is just to call it and return early
if we have a GV with appending linkage.

original message:

    Refactor duplicated code in liking GlobalValues.

    There is quiet a bit of logic that is common to any GlobalValue but was
    duplicated for Functions, GlobalVariables and GlobalAliases.

    While at it, merge visibility even when comdats are used, fixing pr21415.

llvm-svn: 221098
2014-11-02 13:28:57 +00:00
Chandler Carruth fd38af2d13 Revert r221014: "Refactor duplicated code in liking GlobalValues."
This commit introduces heap-use-after-free detected by ASan. Here is the output
for one of several tests that detect it:

******************** TEST 'LLVM :: Linker/AppendingLinkage.ll' FAILED ********************
Command Output (stderr):
--
=================================================================
==2122==ERROR: AddressSanitizer: heap-use-after-free on address 0x60c00000b9c8 at pc 0x0000005d05d1 bp 0x7fff64ed27c0 sp 0x7fff64ed27b8
READ of size 4 at 0x60c00000b9c8 thread T0
    #0 0x5d05d0 in llvm::GlobalValue::setUnnamedAddr(bool) /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115:35
    #1 0x69fff1 in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1041:5
    #2 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
    #3 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
    #4 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
    #5 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #6 0x41eb71 in _start (/usr/local/google/home/chandlerc/src/llvm/build/bin/llvm-link+0x41eb71)

0x60c00000b9c8 is located 72 bytes inside of 128-byte region [0x60c00000b980,0x60c00000ba00)
freed by thread T0 here:
    #0 0x4a1e6b in operator delete(void*) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:94:3
    #1 0x5d1a7a in llvm::iplist<llvm::GlobalVariable, llvm::ilist_traits<llvm::GlobalVariable> >::erase(llvm::ilist_iterator<llvm::GlobalVariable>) /usr/local/google/home/chandlerc/src/llvm/build/../inclu
de/llvm/ADT/ilist.h:466:5
    #2 0x5d1980 in llvm::GlobalVariable::eraseFromParent() /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/Globals.cpp:204:3
    #3 0x6a8a4d in (anonymous namespace)::ModuleLinker::linkAppendingVarProto(llvm::GlobalVariable*, llvm::GlobalVariable const*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.
cpp:980:3
    #4 0x6a7403 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1074:11
    #5 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
    #6 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
    #7 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
    #8 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
    #9 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287

previously allocated by thread T0 here:
    #0 0x4a192b in operator new(unsigned long) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:62:35
    #1 0x61d85c in llvm::User::operator new(unsigned long, unsigned int) /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/User.cpp:57:19
    #2 0x6a7525 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1100:3
    #3 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
    #4 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
    #5 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
    #6 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
    #7 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287

SUMMARY: AddressSanitizer: heap-use-after-free /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115 llvm::GlobalValue::setUnnamedAddr(bool)
Shadow bytes around the buggy address:
  0x0c187fff96e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c187fff96f0: 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa
  0x0c187fff9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x0c187fff9710: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c187fff9720: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
=>0x0c187fff9730: fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd
  0x0c187fff9740: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c187fff9750: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
  0x0c187fff9760: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c187fff9770: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c187fff9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  ASan internal:           fe
==2122==ABORTING

llvm-svn: 221096
2014-11-02 09:10:31 +00:00
David Blaikie 4aa49b2054 Formatting
llvm-svn: 221095
2014-11-02 08:52:37 +00:00
David Blaikie cafd962d97 Add DwarfUnit::isDwoUnit and use it to generalize string creation
Currently we only need to emit skeleton strings into the CU header and
we do this by explicitly calling "addLocalString". With gmlt-in-fission,
we'll be emitting a bunch of other strings from other codepaths where
it's not statically known that these strings will be local or not.

Introduce a virtual function to indicate whether this unit is a DWO unit
or not (I'm not sure if we have a good term for this, the
opposite/alternative to 'skeleton' unit) and use that to generalize the
string emission logic so that strings can be correctly emitted in both
the skeleton and dwo unit when in split dwarf mode.

And to demonstrate that this works, switch the existing special callers
of addLocalString in the skeleton builder to addString - and they still
work. Yay.

llvm-svn: 221094
2014-11-02 08:51:37 +00:00
David Blaikie 279c451c0b Remove the last mention of LineTablesOnly from DwarfUnit, sinking it into DwarfCompileUnit
This is a useful distinction/invariant/delination to make because
LineTablesOnly mode is never relevant to type units, so it's clear that
we're not doing weird line-tables-only-with-types by making this API
choice.

It also lays the foundations nicely for adding gmlt-like data to fission
skeleton CUs while limiting the effects to CUs and not TUs.

llvm-svn: 221093
2014-11-02 08:18:06 +00:00
David Blaikie 3363a57c8e Sink DwarfUnit::applySubprogramAttributesToDefinition into DwarfCompileUnit
llvm-svn: 221092
2014-11-02 08:09:09 +00:00
Elena Demikhovsky 27152aea88 Use Alias Analysis to hoist 2 loads from diamond to the common predecessor basic block.
Alias Analysis allows to detect real barriers for load hoisting.

Review in http://reviews.llvm.org/D5991

llvm-svn: 221091
2014-11-02 08:03:05 +00:00
David Blaikie 978020807a Sink DwarfUnit::addExpr into DwarfCompileUnit
llvm-svn: 221090
2014-11-02 07:11:55 +00:00
David Blaikie 8c485b5d74 Fix the build from the last commit
llvm-svn: 221089
2014-11-02 07:08:12 +00:00
David Blaikie 02a6333ba7 Sink DwarfUnit::applyVariableAttributes into DwarfCompileUnit
llvm-svn: 221088
2014-11-02 07:06:51 +00:00
David Blaikie 4bc0881ac7 Sink DwarfUnit::addLocationList down into DwarfCompileUnit
llvm-svn: 221087
2014-11-02 07:03:19 +00:00
David Blaikie 77895fb276 Sink DwarfUnit::addComplexAddress down into DwarfCompileUnit
llvm-svn: 221086
2014-11-02 06:58:44 +00:00
David Blaikie f7435ee6ce Push DwarfUnit::addAddress down into DwarfCompileUnit
llvm-svn: 221085
2014-11-02 06:46:40 +00:00
David Blaikie 7d48be2b7b Sink DwarfUnit::addVariableAddress into DwarfCompileUnit since type units don't have variables
llvm-svn: 221084
2014-11-02 06:37:23 +00:00
David Blaikie 192b45c1ef DebugInfo: Sink accelerator table lists down (GlobalNames/Types) into DwarfCompileUnit
llvm-svn: 221083
2014-11-02 06:16:39 +00:00
David Blaikie 98cf172175 Add DwarfUnit::addGlobalType to match DwarfUnit::addGlobalName
(these will shortly become virtual, with a null implementation in
DwarfUnit (since type units don't have accelerator tables in the current
schema) and the current implementation down in DwarfCompileUnit, moving
the actual maps there too)

llvm-svn: 221082
2014-11-02 06:06:14 +00:00
NAKAMURA Takumi cd2996c3e3 Revert r221056 and others, "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
r221056 "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
  r221058 "[mips] Fix unused variable warning introduced in r221056"
  r221059 "[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC."
  r221061 "Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC."

It cuased an undefined behavior in LLVM :: CodeGen/Mips/return-vector.ll.

llvm-svn: 221081
2014-11-02 04:43:54 +00:00
David Blaikie 871c2d9d63 DebugInfo: Refactor index type DIE initialization by rolling it into the accessor
llvm-svn: 221080
2014-11-02 03:09:13 +00:00
David Blaikie ce47366150 Be sure to initialize DwarfCompileUnit::LabelBegin now that it may be skipped in initSection
llvm-svn: 221079
2014-11-02 02:40:26 +00:00
David Blaikie b6726a9ece Don't bother creating LabelBegin for .dwo units
This would help catch cases where we might otherwise try to reference a
dwo CU label, which would be weird - because without relocations in the
dwo file it's not generally meaningful to talk about the CU offsets
there (or, if it is, we can do so in absolute terms without using a
relocation to compute it).

llvm-svn: 221078
2014-11-02 02:26:24 +00:00
David Blaikie 27e35f2302 Drop DwarfCompileUnit::getLocalLabel* in favor of just mapping through the skeleton explicitly.
Confusing to do this two different ways - I'm not too wedded to either
one, but here goes.

llvm-svn: 221076
2014-11-02 01:21:43 +00:00
David Blaikie f4bdc31271 Sink DwarfUnit::LabelBegin down into DwarfCompileUnit since that's the only place it's needed.
llvm-svn: 221075
2014-11-02 01:21:40 +00:00
David Blaikie ae57e66e36 Sink dwarf unit length emission down into DwarfUnit::emitHeader
This allows the CU label to be emitted only for compile units, as
they're the only ones that need it (so they can be referenced from
pubnames)

llvm-svn: 221072
2014-11-01 23:59:23 +00:00
David Majnemer 634ca236dc InstCombine: Don't assume that m_ZExt matches an Instruction
m_ZExt might bind against a ConstantExpr instead of an Instruction.
Assuming this, using cast<Instruction>, results in InstCombine crashing.

Instead, introduce ZExtOperator to bridge both Instruction and
ConstantExpr ZExts.

This fixes PR21445.

llvm-svn: 221069
2014-11-01 23:46:05 +00:00
David Blaikie 983bfea0d0 Remove DwarfUnit::LabelEnd in favor of computing the length of the section directly
This was a compile-unit specific label (unused in type units) and seems
unnecessary anyway when we can more easily directly compute the size of
the compile unit.

llvm-svn: 221067
2014-11-01 23:07:14 +00:00
David Blaikie a34568b220 Sink DwarfUnit::SectionSym into DwarfCompileUnit as it's only needed/used there.
llvm-svn: 221062
2014-11-01 20:06:28 +00:00
Daniel Sanders 8104b75c9f Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC.
Reviewers: rnk

Reviewed By: rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D5978

llvm-svn: 221061
2014-11-01 19:32:23 +00:00
David Blaikie a5437b6bf6 Make DwarfCompileUnit::Skeleton more narrowly typed (DwarfCompileUnit* instead of DwarfUnit*) now that it's specific to DwarfCompileUnit anyway.
llvm-svn: 221060
2014-11-01 19:26:05 +00:00
Daniel Sanders 88e1c7393b [mips] Move all ByVal handling into CCState and tablegen-erated code. NFC.
Summary:
CCState already contains a byval implementation that is very similar to the
Mips custom code. This patch merges the custom code into the existing
common code and tablegen-erated code.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D5977

llvm-svn: 221059
2014-11-01 19:17:10 +00:00
Daniel Sanders 658dc47179 [mips] Fix unused variable warning introduced in r221056
llvm-svn: 221058
2014-11-01 18:53:01 +00:00
Daniel Sanders f43e68793b [mips] Remove ByValArgInfo::Address in favour of CCValAssign::getMemLocOffset(). NFC.
Summary: ByValArgInfo is practically the same as CCState::ByValInfo now.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5976

llvm-svn: 221057
2014-11-01 18:44:56 +00:00
Daniel Sanders eac09608d0 [mips] Move F128 argument handling into MipsCCState as we did for returns. NFC.
Summary:
There are a couple more changes to make before analyzeFormalArguments can
be merged into the standard AnalyzeFormalArguments. I've had to temporarily
poke a couple holes in MipsCCState's encapsulation to save having to make
all the required changes for this merge all at once*. These will be removed
shortly.

* We must merge our ByVal argument handling with the implementation in CCState.
  This will be done over the next three patches, then the fourth will merge
  analyzeFormalArguments with AnalyzeFormalArguments.

Depends on D5967

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5969

llvm-svn: 221056
2014-11-01 18:38:03 +00:00
David Blaikie 7cbf58af15 Sink DwarfUnit::Skeleton down into DwarfCompileUnit
Type units no longer have skeletons and it's misleading to be able to
query for a type unit's skeleton (it might incorrectly lead one to
conclude that if a unit doesn't have a skeleton it's not in a .dwo
file... ).

llvm-svn: 221055
2014-11-01 18:18:07 +00:00
Daniel Sanders 853c2435b6 [mips] Remove MipsCC::CCInfo. NFC.
Summary:
It's now passed in as an argument to functions that need it. Eventually
this argument will be replaced by the 'this' pointer for a MipsCCState
object.

Depends on D5966

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5967

llvm-svn: 221054
2014-11-01 18:13:52 +00:00
Daniel Sanders 068eea2d14 [mips] Removed MipsCC::fixedArgFn(). NFC
Summary:
There is one remaining trace of it in MipsCC::analyzeCallOperands() where
Mips16 might override the calling convention. This will moved into
tablegen-erated code later.

Depends on D5965

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5966

llvm-svn: 221053
2014-11-01 17:44:51 +00:00
Daniel Sanders ca80f1a05a [tablegen] Add CustomCallingConv and use it to tablegen-erate the outermost parts of the Mips O32 implementation
Summary:
CustomCallingConv is simply a CallingConv that tablegen should not generate the
implementation for. It allows regular CallingConv's to delegate to these custom
functions. This is (currently) necessary for Mips and we cannot use CCCustom
without having to adapt to the different API that CCCustom uses.

This brings us a bit closer to being able to remove
MipsCC::analyzeCallOperands and MipsCC::analyzeFormalArguments in favour of
the common implementation.

No functional change to the targets.

Depends on D3341

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: vmedic, llvm-commits

Differential Revision: http://reviews.llvm.org/D5965

llvm-svn: 221052
2014-11-01 17:38:22 +00:00
David Blaikie f6dac29a83 Sink DwarfDebug::AbstractSPDies down into DwarfFile
This is the first big step to allowing gmlt-like inline scope
information in the skeleton CU. While this commit doesn't change the
functionality, it's only a small step to call
"constructAbstractSubprogramDIE" on both the InfoHolder and the
SkeletonHolder (when in use) and that will at least create the abstract
SP dies in that case, though still not creating the other subprograms.

llvm-svn: 221051
2014-11-01 17:21:26 +00:00
Rafael Espindola 246c4fb5d9 Remove redundant calls to isMaterializable.
This removes calls to isMaterializable in the following cases:

* It was redundant with a call to isDeclaration now that isDeclaration returns
  the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.

llvm-svn: 221050
2014-11-01 16:46:18 +00:00
Daniel Sanders a017974b9a Revert r221048 - Test commit
It seems I can't commit unless $DBUS_SESSION_BUS_ADDRESS is set correctly and
it is not set for ssh sessions.

llvm-svn: 221049
2014-11-01 16:08:03 +00:00
Daniel Sanders 5903eb50b1 Test commit
Added some whitespace to debug some authentication issues I'm having.

llvm-svn: 221048
2014-11-01 16:00:40 +00:00
Daniel Sanders 523b1718d4 [JIT] Fix some more missing endian conversions in RuntimeDyld
Summary: This fixes MachO_i386_eh_frame.s on a big-endian Mips host.

Reviewers: lhames

Reviewed By: lhames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6019

llvm-svn: 221047
2014-11-01 15:52:31 +00:00
David Majnemer 549f4f2510 InstCombine: Combine (X+cst) < 0 --> X < -cst
This can happen pretty often in code that looks like:
int foo = bar - 1;
if (foo < 0)
  do stuff

In this case, bar < 1 is an equivalent condition.

This transform requires that the add instruction be annotated with nsw.

llvm-svn: 221045
2014-11-01 09:09:51 +00:00
David Majnemer c758df4053 IR: Restore the old behavior of getDISubprogram
getDISubprogram was mistakenly thought to contain a bug: we thought we
might need to try harder if we found a DebugLoc we didn't find.

llvm-svn: 221044
2014-11-01 07:57:14 +00:00
Adrian Prantl a0852d2be3 Revert "Temporarily revert r220777 to sort out build bot breakage."
This reverts commit r221028. Later commits depend on this and
reverting just this one causes even more bots to fail.

llvm-svn: 221041
2014-11-01 03:19:45 +00:00
NAKAMURA Takumi 2ab9801509 Revert r220779, "[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT tranforms to AND in PerformSELECTCombine."
Since r221028 (reverting r220777), this caused failures.

llvm-svn: 221040
2014-11-01 01:36:14 +00:00
David Blaikie 2f08093dda Remove unused function
llvm-svn: 221037
2014-11-01 01:15:26 +00:00
David Blaikie a8be0a794f And... fix the build some more.
llvm-svn: 221036
2014-11-01 01:15:24 +00:00
David Blaikie 1998a2e789 Just iterate the DwarfCompileUnits rather than trying to filter them out of the list of all units.
llvm-svn: 221034
2014-11-01 01:11:19 +00:00
David Blaikie bbad0bbb2f Add '*' to auto variable that is a pointer, as per the coding conventions.
llvm-svn: 221033
2014-11-01 01:03:39 +00:00
Diego Novillo d5336ae269 Add show and merge tools for sample PGO profiles.
Summary:
This patch extends the 'show' and 'merge' commands in llvm-profdata to handle
sample PGO formats. Using the 'merge' command it is now possible to convert
one sample PGO format to another.

The only format that is currently not working is 'gcc'. I still need to
implement support for it in lib/ProfileData.

The changes in the sample profile support classes are needed for the
merge operation.

Reviewers: bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6065

llvm-svn: 221032
2014-11-01 00:56:55 +00:00
David Blaikie a33cd6aa27 Add DwarfCompileUnit::getSkeleton that returns DwarfCompileUnit* to avoid having to cast from DwarfUnit* on every call.
llvm-svn: 221031
2014-11-01 00:50:34 +00:00
Adrian Prantl cd4872399a Temporarily revert r220777 to sort out build bot breakage.
"[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros."

llvm-svn: 221028
2014-11-01 00:26:59 +00:00
Duncan P. N. Exon Smith 4abd1a0808 IR: MDNode => Value: Instruction::getAllMetadata()
Change `Instruction::getAllMetadata()` to modify a vector of `Value`
instead of `MDNode` and update call sites.  This is part of PR21433.

llvm-svn: 221027
2014-11-01 00:26:42 +00:00
Duncan P. N. Exon Smith 3872d0084c IR: MDNode => Value: Instruction::getMetadata()
Change `Instruction::getMetadata()` to return `Value` as part of
PR21433.

Update most callers to use `Instruction::getMDNode()`, which wraps the
result in a `cast_or_null<MDNode>`.

llvm-svn: 221024
2014-11-01 00:10:31 +00:00
Duncan P. N. Exon Smith 7c4fc4e5ae IR: MDNode => Value: Add Instruction::getMDNode()
Add `Instruction::getMDNode()` that casts to `MDNode` before changing
`Instruction::getMetadata()` to return `Value`.  This avoids adding
`cast_or_null<MDNode>` boiler-plate throughout the code.

Part of PR21433.

llvm-svn: 221023
2014-10-31 23:58:04 +00:00
Reid Kleckner 6ce76925de Revert "R600: Add missing file to CMakeLists.txt"
This reverts commit r220998.

It should've been reverted with the other change.

llvm-svn: 221021
2014-10-31 23:39:10 +00:00
Reid Kleckner 9abe268adb Revert "R600: Make sure to inline all internal functions"
This reverts commit r220996.

It introduced layering violations causing link errors in many
configurations.

llvm-svn: 221020
2014-10-31 23:35:26 +00:00
Reid Kleckner da00cf5f73 Work around bugs in MSVC "14" CTP 3's conversion logic
It appears to ignore or find ambiguous MachineInstrBuilder's conversion
operators that allow conversion to MachineInstr* and
MachineBasicBlock::bundle_iterator.

As a workaround, add an explicit way to get the MachineInstr.

llvm-svn: 221017
2014-10-31 23:19:46 +00:00
Rafael Espindola 4e27567a12 Refactor duplicated code in liking GlobalValues.
There is quiet a bit of logic that is common to any GlobalValue but was
duplicated for Functions, GlobalVariables and GlobalAliases.

While at it, merge visibility even when comdats are used, fixing pr21415.

llvm-svn: 221014
2014-10-31 23:10:07 +00:00
David Blaikie 1d96cc2ff3 Sink some of DwarfDebug::collectDeadVariables down into DwarfCompileUnit.
llvm-svn: 221010
2014-10-31 22:30:30 +00:00
Michael Zolotukhin 9b9624de0c Correctly update dom-tree after loop vectorizer.
llvm-svn: 221009
2014-10-31 22:28:03 +00:00
David Blaikie 49be5b357b Sink most of DwarfDebug::constructAbstractSubprogramScopeDIE into DwarfCompileUnit
llvm-svn: 221005
2014-10-31 21:57:02 +00:00
Tom Stellard 6693f9cf3d R600: Add IPO to the list of required libraries
llvm-svn: 221004
2014-10-31 21:52:08 +00:00
Lang Hames f04de6ec48 [Object] Modify OwningBinary's interface to separate inspection from ownership.
The getBinary and getBuffer method now return ordinary pointers of appropriate
const-ness. Ownership is transferred by calling takeBinary(), which returns a
pair of the Binary and a MemoryBuffer.

llvm-svn: 221003
2014-10-31 21:37:49 +00:00
Tom Stellard 4ad4177399 R600: Add missing file to CMakeLists.txt
llvm-svn: 220998
2014-10-31 20:56:36 +00:00
Tom Stellard 5b2927fe83 R600: Don't promote allocas when one of the users is a ptrtoint instruction
We need to figure out how to track ptrtoint values all the
way until result is converted back to a pointer in order
to correctly rewrite the pointer type.

llvm-svn: 220997
2014-10-31 20:52:04 +00:00
Tom Stellard aa73831757 R600: Make sure to inline all internal functions
Function calls aren't supported yet.

llvm-svn: 220996
2014-10-31 20:52:02 +00:00
Duncan P. N. Exon Smith 7779b4fd8e IR: Instruction::setMetadata() should use cast_or_null
Not sure why this assertion didn't fire locally [1], but in r220994
`Instruction::setMetadata()` should be using `cast_or_null`.

[1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/12327

llvm-svn: 220995
2014-10-31 20:28:04 +00:00
Duncan P. N. Exon Smith e5d641ebca IR: MDNode => Value: Instruction::setMetadata()
Change `Instruction::setMetadata()` API to accept `Value` instead of
`MDNode`.  Part of PR21433.

llvm-svn: 220994
2014-10-31 20:13:11 +00:00
Bill Schmidt 1ca69fa64d [PowerPC] Initial VSX intrinsic support, with min/max for vector double
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions.  This patch adds
basic support for VSX intrinsics in general, and tests it by
implementing intrinsics for minimum and maximum for the vector double
data type.

The LLVM portion of this is quite straightforward.  There is a
companion patch for Clang.

llvm-svn: 220988
2014-10-31 19:19:07 +00:00
Chad Rosier 7bb413e3ba [AArch64] Check Dest Register Liveness in CondOpt pass.
Our internal test reveals such case should not be transformed:

  cmp x17, #3
  b.lt .LBB10_15
  ...
  subs x12, x12, #1
  b.gt .LBB10_1

where x12 is a liveout, becomes:

  cmp x17, #2
  b.le .LBB10_15
  ...
  subs x12, x12, #2
  b.ge .LBB10_1

Unable to provide test case as it's difficult to reproduce on community branch.

http://reviews.llvm.org/D6048
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!

llvm-svn: 220987
2014-10-31 19:02:38 +00:00
Kostya Serebryany ea48bdc702 [asan] do not treat inline asm calls as indirect calls
llvm-svn: 220985
2014-10-31 18:38:23 +00:00
Quentin Colombet c32615dfef [CodeGenPrepare] Move extractelement close to store if they can be combined.
This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
The optimization may promote any scalar operations to vector operations in the
way to make that possible.


** Context **

Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur copy from one
register file to another. These copies are not coalescable and may be expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.


** Motivating Example **

Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
 %in1 = load <2 x i32>* %addr1, align 8
 %extract = extractelement <2 x i32> %in1, i32 1
 %out = or i32 %extract, 1
 store i32 %out, i32* %dest, align 4
 ret void
}

As it is, this IR generates the following assembly on armv7:
  vldr  d16, [r0]            @vector load  
  vmov.32 r0, d16[1]  @ cross-register-file copy: 20 cycles
  orr r0, r0, #1           @ scalar bitwise or
  str r0, [r1]               @ scalar store
  bx  lr

Whereas we could generate much faster code:
  vldr  d16, [r0]               @ vector load
  vorr.i32  d16, #0x1     @ vector bitwise or
  vst1.32 {d16[1]}, [r1:32] @ vector extract + store
  bx  lr

Half of the computation made in the vector is useless, but this allows to get
rid of the expensive cross-register-file copy.


** Proposed Solution **

To avoid this cross-register-copy penalty, we promote the scalar operations to
vector operations. The penalty will be removed if we manage to promote the whole
chain of computation in the vector domain.
Currently, we do that only when the chain of computation ends by a store and the
target is able to combine an extract with a store.

Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and so, extracted as some point[1]. Moreover,
this is customary that targets feature stores that perform a vector extract (see
AArch64 and X86 for instance).

The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate for this level of details and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically in TargetTransformInfo, everything that is legal has a cost of 1,
whereas, even if a vector type is legal, usually a vector operation is slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counter balanced by the saving of the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.

For now, the optimization is just enabled for ARM prior than v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. Because of these two factors, we limit the effects of
the inaccuracy. Indeed, I did not want to build up a fancy cost model with block
frequency and everything on top of that.

[1] We can imagine targets that can combine an extractelement with  other
instructions than just stores. If we want to go into that direction, the current
interfaces must be augmented and, moreover, I think this becomes a global isel
problem.

Differential Revision: http://reviews.llvm.org/D5921

<rdar://problem/14170854>

llvm-svn: 220978
2014-10-31 17:52:53 +00:00
Kostya Serebryany 001ea5fe15 [asan] fix caller-calee instrumentation to emit new cache for every call site
llvm-svn: 220973
2014-10-31 17:11:27 +00:00
David Blaikie 626507fab3 Update the non-pthreads fallback for RWMutex on Unix
Tested this by #if 0'ing out the pthreads implementation, which
indicated that this fallback was not currently compiling successfully
and applying this patch resolves that.

Patch by Andy Chien.

llvm-svn: 220969
2014-10-31 17:02:30 +00:00
David Blaikie fb2efde341 Correct assert text from r220923
Noticed in post-commit review by Adrian Prantl.

llvm-svn: 220967
2014-10-31 16:45:36 +00:00
Rafael Espindola 3e8bc6a8c3 Mark a few variables const. NFC.
llvm-svn: 220964
2014-10-31 16:08:17 +00:00
Chad Rosier a675e550ca [AArch64] CondOpt pass is missing FCMP instructions when searching backward for
a CMP which defines the flags used by B.CC.

http://reviews.llvm.org/D6047
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!

llvm-svn: 220961
2014-10-31 15:17:36 +00:00
Bradley Smith 9992b167ae [SCEV] Improve Scalar Evolution's use of no {un,}signed wrap flags
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation bewteen
the two, since we can statically determine which is greater.

This allows us to unroll loops such as:

 void testcase3(int v) {
   for (int i=v; i<=v+1; ++i)
     f(i);
 }

llvm-svn: 220960
2014-10-31 11:40:32 +00:00
Ulrich Weigand c8c2ea2854 [PowerPC] Load BlockAddress values from the TOC in 64-bit SVR4 code
Since block address values can be larger than 2GB in 64-bit code, they
cannot be loaded simply using an @l / @ha pair, but instead must be
loaded from the TOC, just like GlobalAddress, ConstantPool, and
JumpTable values are.

The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where
temporary labels could not be used as TOC values, since code would
attempt (and fail) to use GetOrCreateSymbol to create a symbol of the
same name as the temporary label.

llvm-svn: 220959
2014-10-31 10:33:14 +00:00
David Majnemer c7d7c6fb3a Object, COFF: Cleanup symbol type code, improve binutils compatibility
Do a better job classifying symbols.  This increases the consistency
between the COFF handling code and the ELF side of things.

llvm-svn: 220952
2014-10-31 05:07:00 +00:00
Rafael Espindola 0ae225b6b5 Move definition closer to use. NFC.
llvm-svn: 220949
2014-10-31 04:46:38 +00:00
Hao Liu e02b1a068f PR20557: Fix the bug that bogus cpu parameter crashes llc on AArch64 backend.
Initial patch by Oleg Ranevskyy.

llvm-svn: 220945
2014-10-31 02:35:34 +00:00
Ahmed Bougacha 9f336c4ec5 [SelectionDAG] When scalarizing trunc, don't assert for legal operands.
r212242 introduced a legalizer hook, originally to let AArch64 widen
v1i{32,16,8} rather than scalarize, because the legalizer expected, when
scalarizing the result of a conversion operation, to already have
scalarized the operands.  On AArch64, v1i64 is legal, so that commit
ensured operations such as v1i32 = trunc v1i64 wouldn't assert.

It did that by choosing to widen v1 types whenever possible.  However,
v1i1 types, for which there's no legal widened type, would still trigger
the assert.

This commit fixes that, by only scalarizing a trunc's result when the
operand has already been scalarized, and introducing an extract_elt
otherwise.  
This is similar to r205625.

Fixes PR20777.

llvm-svn: 220937
2014-10-30 23:46:50 +00:00
Hans Wennborg 6cda0d7269 Speculative fix for Windows build after r220932
llvm-svn: 220936
2014-10-30 23:10:01 +00:00
Louis Gerbarg e8f9c78247 Fix incorrect invariant check in DAG Combine
Earlier this summer I fixed an issue where we were incorrectly combining
multiple loads that had different constraints such alignment, invariance,
temporality, etc. Apparently in one case I made copt paste error and swapped
alignment and invariance.

Tests included.

rdar://18816719

llvm-svn: 220933
2014-10-30 22:21:03 +00:00
Chris Bieneman 14e2bcccfb Removing the static initializer in ManagedStatic.cpp by using llvm_call_once to initialize the ManagedStatic mutex.
Summary:
This patch adds an llvm_call_once which is a wrapper around std::call_once on platforms where it is available and devoid of bugs. The patch also migrates the ManagedStatic mutex to be allocated using llvm_call_once.

These changes are philosophically equivalent to the changes added in r219638, which were reverted due to a hang on Win32 which was the result of a bug in the Windows implementation of std::call_once.

Reviewers: aaron.ballman, chapuni, chandlerc, rnk

Reviewed By: rnk

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D5922

llvm-svn: 220932
2014-10-30 22:07:09 +00:00
Rafael Espindola d1e64b1e93 Fix the merging of the constantness of declarations.
The langref says:

LLVM explicitly allows declarations of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the ‘constantness’ are valid for the
translation units that do not include the definition.

Given that definition, when merging two declarations, we have to drop
constantness if of of them is not marked contant, since the Module
without the constant marker might not have the necessary guarantees.

llvm-svn: 220927
2014-10-30 20:50:23 +00:00
Philip Reames 4cb4d3e048 Add handling for range metadata in ValueTracking isKnownNonZero
If we load from a location with range metadata, we can use information about the ranges of the loaded value for optimization purposes.  This helps to remove redundant checks and canonicalize checks for other optimization passes.  This particular patch checks whether a value is known to be non-zero from the range metadata.

Currently, these tests are against InstCombine.  In theory, all of these should be InstSimplify since we're not inserting any new instructions.  Moving the code may follow in a separate change.

Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D5947

llvm-svn: 220925
2014-10-30 20:25:19 +00:00
David Blaikie 76fd43c653 PR21408: Workaround the appearance of duplicate variables due to problems when inlining two calls to the same function from the same call site.
llvm-svn: 220923
2014-10-30 20:20:11 +00:00
Diego Novillo 77a5a5fcda Fix Twine corruption problem with diagnostics.
This fixes the autobuilders I broke with a recent patch. Thanks echristo
and dblaikie for beating me with a clue stick.

llvm-svn: 220918
2014-10-30 18:48:41 +00:00
Diego Novillo c572e92c76 Add profile writing capabilities for sampling profiles.
Summary:
This patch finishes up support for handling sampling profiles in both
text and binary formats. The new binary format uses uleb128 encoding to
represent numeric values. This makes profiles files about 25% smaller.

The profile writer class can write profiles in the existing text and the
new binary format. In subsequent patches, I will add the capability to
read (and perhaps write) profiles in the gcov format used by GCC.

Additionally, I will be adding support in llvm-profdata to manipulate
sampling profiles.

There was a bit of refactoring needed to separate some code that was in
the reader files, but is actually common to both the reader and writer.

The new test checks that reading the same profile encoded as text or
raw, produces the same results.

Reviewers: bogner, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6000

llvm-svn: 220915
2014-10-30 18:00:06 +00:00
Robert Khasanov af318f7073 [AVX512] Added VBROADCAST{SS/SD} encoding for VL subset.
Refactored through AVX512_maskable
        

llvm-svn: 220908
2014-10-30 14:21:47 +00:00
Peter Collingbourne dd3486ece1 [dfsan] New calling convention for custom functions with variadic arguments.
Summary:
The previous calling convention prevented custom functions from being able
to access argument labels unless it knew how many variadic arguments there
were, and of which type. This restriction made it impossible to correctly
model functions in the printf family, as it is legal to pass more arguments
than required to those functions. We now pass arguments in the following order:

non-vararg arguments
labels for non-vararg arguments
[if vararg function, pointer to array of labels for vararg arguments]
[if non-void function, pointer to label for return value]
vararg arguments

Differential Revision: http://reviews.llvm.org/D6028

llvm-svn: 220906
2014-10-30 13:22:57 +00:00
NAKAMURA Takumi 32c87aced3 Untabify.
llvm-svn: 220884
2014-10-29 23:44:35 +00:00
Yi Jiang ab19fff4d8 Do not simplifyLatch for loops where hoisting increments couldresult in extra live range interferance
llvm-svn: 220872
2014-10-29 20:19:47 +00:00
Robert Khasanov 595e598869 [AVX512] Implemented AVX512VL FP bnary packed instructions (VADDP*, VSUBP*, VMULP*, VDIVP*, VMAXP*, VMINP*)
Refactored through AVX512_maskable
Added encoding tests for them.

llvm-svn: 220858
2014-10-29 15:43:02 +00:00
NAKAMURA Takumi f51a34ec1f Whitespace.
llvm-svn: 220857
2014-10-29 15:23:11 +00:00
Michael Kuperstein f9c3480322 Fix build with CMake if LLVM_USE_INTEL_JITEVENTS option is enabled
* Added LLVM libraries required for IntelJITEvents to LLVMBuild.txt.
* Removed 'jit' library from llvm-jitlistener.
* Added support for OptionalLibraries to llvm-build cmake files generator.

Patch by aleksey.a.bader@intel.com

Differential Revision: http://reviews.llvm.org/D5646

llvm-svn: 220848
2014-10-29 09:18:49 +00:00
Peter Zotov 2481c75f8b [C API] PR19859: Add functions to query and modify branches.
Patch by Gabriel Radanne <drupyog@zoho.com>.

llvm-svn: 220817
2014-10-28 19:46:56 +00:00
Peter Zotov 1d98e6ddef [C API] PR19859: Add LLVMGetFCmpPredicate and LLVMConstRealGetDouble.
Patch by Gabriel Radanne <drupyog@zoho.com>.

llvm-svn: 220814
2014-10-28 19:46:44 +00:00
Saleem Abdulrasool d178ada55e Transforms: reapply SVN r219899
This restores the commit from SVN r219899 with an additional change to ensure
that the CodeGen is correct for the case that was identified as being incorrect
(originally PR7272).

In the case that during inlining we need to synthesize a value on the stack
(i.e. for passing a value byval), then any function involving that alloca must
be stripped of its tailness as the restriction that it does not access the
parent's stack no longer holds.  Unfortunately, a single alloca can cause a
rippling effect through out the inlining as the value may be aliased or may be
mutated through an escaped external call.  As such, we simply track if an alloca
has been introduced in the frame during inlining, and strip any tail calls.

llvm-svn: 220811
2014-10-28 18:27:37 +00:00
Robert Khasanov 1cf354c92f [AVX512] Fix VSQRT packed instructions internal names.
No functional change

llvm-svn: 220808
2014-10-28 18:22:41 +00:00
Robert Khasanov eb12639375 [AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset.
Refactored through AVX512_maskable

llvm-svn: 220806
2014-10-28 18:15:20 +00:00
Robert Khasanov 3e534c93b9 [AVX-512] Expanded rsqrt/rcp instructions to VL subset.
Refactored multiclass through AVX512_maskable

llvm-svn: 220783
2014-10-28 16:37:13 +00:00
Robert Khasanov 784c3f0d6e [AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT tranforms to AND in PerformSELECTCombine.
No functional change.

llvm-svn: 220779
2014-10-28 16:17:14 +00:00
Robert Khasanov 4441c4d31b [x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros.
This transformation worked if selector is produced by SETCC, however SETCC is needed only if we consider to swap operands. So I replaced SETCC check for this case.
Added tests for vselect of <X x i1> values.

llvm-svn: 220777
2014-10-28 15:59:40 +00:00
Aaron Ballman 5af8ba49a6 Silencing an "enumeral and non-enumeral type in conditional expression" warning; NFC.
llvm-svn: 220775
2014-10-28 13:12:13 +00:00
Robert Khasanov dd09a8f320 [AVX512] Bring back vector-shuffle lowering support through broadcasts
Ffter commit at rev219046 512-bit broadcasts lowering become non-optimal. Most of tests on broadcasting and embedded broadcasting were changed and they doesn’t produce efficient code.

Example below is from commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll):

 define   <16 x i32> @_inreg16xi32(i32 %a) {
 ; CHECK-LABEL: _inreg16xi32:
 ; CHECK:       ## BB#0:
-; CHECK-NEXT:    vpbroadcastd %edi, %zmm0
+; CHECK-NEXT:    vmovd %edi, %xmm0
+; CHECK-NEXT:    vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT:    vinserti64x4 $1, %ymm0, %zmm0, %zmm0
 ; CHECK-NEXT:    retq
 %b = insertelement <16 x i32> undef, i32 %a, i32 0
 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
 ret <16 x i32> %c
}

Here, 256-bit broadcast was generated instead of 512-bit one.

In this patch
1) I added vector-shuffle lowering through broadcasts
2) Removed asserts and branches likes because this is incorrect
-  assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
3) Fixed lowering tests

llvm-svn: 220774
2014-10-28 12:28:51 +00:00
NAKAMURA Takumi d0e13af22c Reformat partially, where I touched for whitespace changes.
llvm-svn: 220773
2014-10-28 11:54:52 +00:00
NAKAMURA Takumi 5af50a5470 LoopRerollPass.cpp: Use range-based loop. NFC.
llvm-svn: 220772
2014-10-28 11:54:05 +00:00
NAKAMURA Takumi 335a7bcf1e Untabify and whitespace cleanups.
llvm-svn: 220771
2014-10-28 11:53:30 +00:00
David Blaikie ff468a5e0b Minimize the scope of some variables, NFC.
llvm-svn: 220759
2014-10-28 02:57:26 +00:00
Reid Kleckner 9ccce99e1d X86: Implement the vectorcall calling convention
This is a Microsoft calling convention that supports both x86 and x86_64
subtargets. It passes vector and floating point arguments in XMM0-XMM5,
and passes them indirectly once they are consumed.

Homogenous vector aggregates of up to four elements can be passed in
sequential vector registers, but this part is not implemented in LLVM
and will be handled in Clang.

On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as
integer register parameters and is callee cleanup. On x86_64, it
delegates to the normal win64 calling convention.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D5943

llvm-svn: 220745
2014-10-28 01:29:26 +00:00
Tim Northover 00917897b2 AArch64: enable Cortex-A57 FP balancing on Cortex-A53.
Benchmarks have shown that it's harmless to the performance there, and having a
unified set of passes between the two cores where possible helps big.LITTLE
deployment.

Patch by Z. Zheng.

llvm-svn: 220744
2014-10-28 01:24:32 +00:00
Rafael Espindola 9f8eff31db Remove the PreserveSource linker mode.
I noticed that it was untested, and forcing it on caused some tests to fail:

    LLVM :: Linker/metadata-a.ll
    LLVM :: Linker/prefixdata.ll
    LLVM :: Linker/type-unique-odr-a.ll
    LLVM :: Linker/type-unique-simple-a.ll
    LLVM :: Linker/type-unique-simple2-a.ll
    LLVM :: Linker/type-unique-simple2.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/unnamed-addr1-a.ll
    LLVM :: Linker/visibility1.ll

If it is to be resurrected, it has to be fixed and we should probably have a
-preserve-source command line option in llvm-mc and run tests with and without
it.

llvm-svn: 220741
2014-10-28 00:24:16 +00:00
NAKAMURA Takumi 949fb6d276 AArch64InstrInfo.h: Fix a warning introduced in clang r220703. [-Winconsistent-missing-override]
llvm-svn: 220739
2014-10-27 23:29:27 +00:00
Adam Nemet cf7a4a2660 [AVX512] Add vpermil variable version
This is implemented via a multiclass that derives from the vperm imm
multiclass.

Fixes <rdar://problem/18426089>

llvm-svn: 220737
2014-10-27 23:08:40 +00:00
Adam Nemet 8d85b0cd3e [AVX512] Clean up avx512_perm_imm to use X86VectorVTInfo
No functionality change.  No change in X86.td.expanded except that we only set
the CD8 attributes for the memory variants.  (This shouldn't be used unless we
have a memory operand.)

llvm-svn: 220736
2014-10-27 23:08:37 +00:00
Adam Nemet 9aad13164e [AVX512] Derive vpermil* from avx512_perm_imm
This used to derive from avx512_pshuf_imm which is confusing.

NFC.  Compared X86.td.expanded.

llvm-svn: 220735
2014-10-27 23:08:34 +00:00
Adam Nemet c51cee85b7 [AVX512] Fix copy-and-paste bugs in vpermil
1) i512mem -> f512mem (this is the packed FP input being permuted)
2) element size is 64 bits in EVEX_CD8 for PD.

(A good illustration why X86VectorVTInfo is useful)

llvm-svn: 220734
2014-10-27 23:08:31 +00:00
Rafael Espindola 4160f5d3ac Make it easier to pass a custom diagnostic handler to the IR linker.
llvm-svn: 220732
2014-10-27 23:02:10 +00:00
Pete Cooper 7c801dc90b Fix a stackmap bug introduced in r220710.
For a call to not return in to the stackmap shadow, the shadow must end with the call.

To do this, we must insert any required nops *before* the call, and not after it.

llvm-svn: 220728
2014-10-27 22:38:45 +00:00
Rafael Espindola 5f13d6461b Fix bug where sys::Wait could wait on wrong pid.
Setting ChildPid to -1 would cause waitpid to wait for any child process.

Patch by Daniel Reynaud!

llvm-svn: 220717
2014-10-27 20:30:04 +00:00
Juergen Ributzka 7ccebec668 [FastISel][AArch64] Emit immediate version of icmp (subs) for null pointer check.
This is a minor change to use the immediate version when the operand is a null
value. This should get rid of an unnecessary 'mov' instruction in debug
builds and align the code more with the one generated by SelectionDAG.

This fixes rdar://problem/18785125.

llvm-svn: 220713
2014-10-27 19:58:36 +00:00
Juergen Ributzka 0190fea941 [FastISel][AArch64] Optimize compare-and-branch for i1 to use 'tbz'.
Minor enhancement to use 'tbz' for i1 compare-and-branch to get rid of an 'and'
instruction.

This fixes rdar://problem/18784953.

llvm-svn: 220712
2014-10-27 19:46:23 +00:00
Pete Cooper 3c0af35232 Stackmap shadows should consider call returns a branch target.
To avoid emitting too many nops, a stackmap shadow can include emitted instructions in the shadow, but these must not include branch targets.

A return from a call should count as a branch target as patching over the instructions after the call would lead to incorrect behaviour for threads currently making that call, when they return.

llvm-svn: 220710
2014-10-27 19:40:35 +00:00
Juergen Ributzka 90f741a2ce [FastISel][AArch64] Use 'cbz' also for null values (pointers).
The pattern matching for a 'ConstantInt' value was too restrictive. Checking for
a 'Constant' with a bull value is sufficient for using an 'cbz/cbnz' instruction.

This fixes rdar://problem/18784732.

llvm-svn: 220709
2014-10-27 19:38:05 +00:00
Juergen Ributzka eae91040d8 [FastISel][AArch64] Don't fold the 'and' instruction into the 'tbz/tbnz' instruction if it is in a different basic block.
This fixes a bug where the input register was not defined for the 'tbz/tbnz'
instruction. This happened, because we folded the 'and' instruction from a
different basic block.

This fixes rdar://problem/18784013.

llvm-svn: 220704
2014-10-27 19:16:48 +00:00
Juergen Ributzka 6de054a25a [FastISel][AArch64] Fix load/store with frame indices.
At higher optimization levels the LLVM IR may contain more complex patterns for
loads/stores from/to frame indices. The 'computeAddress' function wasn't able to
handle this and triggered an assertion.

This fix extends the possible addressing modes for frame indices.

This fixes rdar://problem/18783298.

llvm-svn: 220700
2014-10-27 18:21:58 +00:00
Kostya Serebryany 4f8f0c5aa2 [asan] experimental tracing for indirect calls, llvm part.
llvm-svn: 220699
2014-10-27 18:13:56 +00:00
Lang Hames 5fe30ca56f [PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these
sets as keys into a cache of interference matrice values in the Interference
constraint adder.

Creating interference matrices was one of the large remaining time-sinks in
PBQP. Caching them reduces the total compile time (when using PBQP) on the
nightly test suite by ~10%.

llvm-svn: 220688
2014-10-27 17:44:25 +00:00
NAKAMURA Takumi 729be14435 Prune CRLF.
llvm-svn: 220678
2014-10-27 12:37:26 +00:00
Oliver Stannard 79efe41a0c [ARM] Select VMAXNM and VMINNM regardless of operand order
Currently, the ARM backend will select the VMAXNM and VMINNM for these C
expressions:
  (a < b) ? a : b
  (a > b) ? a : b
but not these expressions:
  (a > b) ? b : a
  (a < b) ? b : a

This patch allows all of these expressions to be matched.

llvm-svn: 220671
2014-10-27 09:23:02 +00:00
Yuri Gorshenin 3e22bb8c54 [asan-asm-instrumentation] Added comment describing how asm instrumentation works.
Summary: [asan-asm-instrumentation] Added comment describing how asm instrumentation works.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5970

llvm-svn: 220670
2014-10-27 08:38:54 +00:00
NAKAMURA Takumi 59c74b225a Fix unicode chars into ascii in comment lines.
llvm-svn: 220668
2014-10-27 08:08:18 +00:00
David Majnemer c8bdd23acf InstCombine: Fix a combine assuming that icmp operands were integers
An icmp may have pointer arguments, it isn't limited to integers or
vectors of integers.

This fixes PR21388.

llvm-svn: 220664
2014-10-27 05:47:49 +00:00
Rafael Espindola 18c89411b8 LinkModules.cpp: don't repeat names in comments.
llvm-svn: 220662
2014-10-27 02:35:46 +00:00
David Blaikie df1cca3a24 Remove some unnecessary casts.
llvm-svn: 220658
2014-10-26 23:37:04 +00:00
Arnold Schwaighofer eb1a38fa73 Add an option to the LTO code generator to disable vectorization during LTO
We used to always vectorize (slp and loop vectorize) in the LTO pass pipeline.

r220345 changed it so that we used the PassManager's fields 'LoopVectorize' and
'SLPVectorize' out of the desire to be able to disable vectorization using the
cl::opt flags 'vectorize-loops'/'slp-vectorize' which the before mentioned
fields default to.
Unfortunately, this turns off vectorization because those fields
default to false.
This commit adds flags to the LTO library to disable lto vectorization which
reconciles the desire to optionally disable vectorization during LTO and
the desired behavior of defaulting to enabled vectorization.

We really want tools to set PassManager flags directly to enable/disable
vectorization and not go the route via cl::opt flags *in*
PassManagerBuilder.cpp.

llvm-svn: 220652
2014-10-26 21:50:58 +00:00
Elena Demikhovsky 4b01b7306c AVX-512: Fixed encoding of VPBROADCASTM and added SKX forms of this instruction
llvm-svn: 220638
2014-10-26 09:52:24 +00:00
Andrew Trick dd925ad218 LSR: Minor cleanup after Daniel's patch.
Combine the Inserted an Done sets into a Visited set.

llvm-svn: 220623
2014-10-25 19:59:30 +00:00
Andrew Trick 9ccbed5a12 Fix LSR compile time.
This is a simple fix that brings the compilation time from 5min to 5s
on a specific real-world example. It's a large chain of computation in
a crypto routine (always a problem for SCEV). A unit test is not
feasible and there would be no way to check it. The fix is just basic
good practice for dealing with SCEVs, there's no risk of regression.

Patch by Daniel Reynaud!

llvm-svn: 220622
2014-10-25 19:42:07 +00:00
Jingyue Wu fe72fcebf6 [SeparateConstOffsetFromGEP] Fixed a bug related to unsigned modulo
The dividend in "signed % unsigned" is treated as unsigned instead of signed,
causing unexpected behavior such as -64 % (uint64_t)24 == 0.

Added a regression test in split-gep.ll

Patched by Hao Liu.

llvm-svn: 220618
2014-10-25 18:34:03 +00:00
Benjamin Kramer 63207bc9c3 Clean up assume intrinsic pattern matching, no need to check that the argument is a value.
Also make it const safe and remove superfluous casting. NFC.

llvm-svn: 220616
2014-10-25 18:09:01 +00:00
Jingyue Wu b723152379 [SeparateConstOffsetFromGEP] Fixed a bug in rebuilding OR expressions
The two operands of the new OR expression should be NextInChain and TheOther
instead of the two original operands.

Added a regression test in split-gep.ll.

Hao Liu reported this bug, and provded the test case and an initial patch.
Thanks! 

llvm-svn: 220615
2014-10-25 17:36:21 +00:00
Simon Pilgrim a63672665f [X86][SSE] Vector integer/float conversion memory folding
Tidied up some entries in the folding tables so that they are under the correct comment section (they were categorised as AVX2 instructions when they're AVX1).

Minor patch agreed with qcolombet.

llvm-svn: 220613
2014-10-25 08:11:20 +00:00
David Majnemer 2abb8183b5 InstCombine: Remove overzealous asserts
These asserts can trigger if the worklist iteration order is
sufficiently unlucky.  Instead of adding special case logic to handle
these edge conditions, just bail out on trying to transform them:
InstSimplify will get them when it reaches them on the worklist.

This fixes PR21378.

N.B.  No test case is included because any test would rely on the
fragile worklist iteration order.

llvm-svn: 220612
2014-10-25 07:13:13 +00:00
Rafael Espindola 98ab63ce98 Allow the C API users to keep relying on the OutMessages parameter.
Should fix the Ocaml tests.

llvm-svn: 220611
2014-10-25 04:31:08 +00:00
Rafael Espindola d12b4a334b Update the error handling of lib/Linker.
Instead of passing a std::string&, use the new diagnostic infrastructure.

llvm-svn: 220608
2014-10-25 04:06:10 +00:00
Jingyue Wu ea51161a94 [NVPTX] aligned byte-buffers for vector return types
Summary:
Fixes PR21100 which is caused by inconsistency between the declared return type
and the expected return type at the call site. The new behavior is consistent
with nvcc and the NVPTXTargetLowering::getPrototype function.

Test Plan: test/Codegen/NVPTX/vector-return.ll

Reviewers: jholewinski

Reviewed By: jholewinski

Subscribers: llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5612

llvm-svn: 220607
2014-10-25 03:46:16 +00:00
Evgeniy Stepanov d337a59db5 [msan] Make -msan-check-constant-shadow a bit stronger.
Allow (under the experimental flag) non-Instructions to participate in MSan checks.

llvm-svn: 220601
2014-10-24 23:34:15 +00:00
Rafael Espindola 5a52e6dc9e Modernize the error handling of the Materialize function.
llvm-svn: 220600
2014-10-24 22:50:48 +00:00
Kevin Enderby 2813f496d9 Fix a Mach-O assembler segfault for a subtraction expression with an undefined symbol.
In a Mach-O object file a relocatable expression of the form
SymbolA - SymbolB + constant is allowed when both symbols are
defined in a section.  But when either symbol is undefined it
is an error.

The code was crashing when it had an undefined symbol in this case.
And should have printed a error message using the location information
in the relocation entry.

rdar://18678402

llvm-svn: 220599
2014-10-24 22:39:40 +00:00
Frederic Riss 987fe22789 Sink DwarfUnit::constructImportedEntityDIE into DwarfCompileUnit.
So that it has access to getOrCreateGlobalVariableDIE. If we ever support
decsribing using directive in C++ classes (thus requiring support in type
units), it will certainly use another mechanism anyway.

Differential Revision: http://reviews.llvm.org/D5975

llvm-svn: 220594
2014-10-24 21:31:09 +00:00
Simon Pilgrim fd080af0c5 [X86][SSE] Bitcast assertion in XFormVExtractWithShuffleIntoLoad
Minor patch to fix an issue in XFormVExtractWithShuffleIntoLoad where a load is unary shuffled, then bitcast (to a type with the same number of elements) before extracting an element.

An undef was created for the second shuffle operand using the original (post-bitcasted) vector type instead of the pre-bitcasted type like the rest of the shuffle node - this was then causing an assertion on the different types later on inside SelectionDAG::getVectorShuffle.

Differential Revision: http://reviews.llvm.org/D5917

llvm-svn: 220592
2014-10-24 21:04:41 +00:00
Colin LeMahieu 838307b31f [Hexagon] Resubmission of 220427
Modified library structure to deal with circular dependency between HexagonInstPrinter and HexagonMCInst.
Adding encoding bits for add opcode.
Adding llvm-mc tests.
Removing unit tests.

http://reviews.llvm.org/D5624

llvm-svn: 220584
2014-10-24 19:00:32 +00:00
Matt Arsenault 4b7bd2d52f Fix copy paste comment
llvm-svn: 220581
2014-10-24 18:13:10 +00:00
Rafael Espindola d4bcefc7d9 Don't ever call materializeAllPermanently during LTO.
To do this, change the representation of lazy loaded functions.

The previous representation cannot differentiate between a function whose body
has been removed and one whose body hasn't been read from the .bc file. That
means that in order to drop a function, the entire body had to be read.

llvm-svn: 220580
2014-10-24 18:13:04 +00:00
Sanjay Patel f924e11967 Allow AVX vrsqrtps generation.
This is a follow-on to r220570 that allows a 256-bit (v8f32)
version of vrsqrtps to be generated.

llvm-svn: 220579
2014-10-24 17:59:18 +00:00
David Blaikie 80e5b1ebd1 DebugInfo: Sink DwarfDebug::ScopeVariables down into DwarfFile
(part of refactoring to allow subprogram emission in both the skeleton
and main units to enable -gmlt-like data to be included in the skeleton
for live inlined backtracing purposes)

llvm-svn: 220578
2014-10-24 17:57:34 +00:00
David Blaikie 0dc623c232 Remove DwarfDebug::FirstCU as it has no use
It was only being used as a flag to identify the lack of debug info from
within endModule - use the section labels for that instead.

llvm-svn: 220575
2014-10-24 17:53:38 +00:00
Sanjay Patel 957efc23bb Use rsqrt (X86) to speed up reciprocal square root calcs
This is a first step for generating SSE rsqrt instructions for
reciprocal square root calcs when fast-math is allowed.

For now, be conservative and only enable this for AMD btver2
where performance improves significantly - for example, 29%
on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c
(if we convert the data type to single-precision float).

This patch adds a two constant version of the Newton-Raphson
refinement algorithm to DAGCombiner that can be selected by any target
via a parameter returned by getRsqrtEstimate()..

See PR20900 for more details:
http://llvm.org/bugs/show_bug.cgi?id=20900

Differential Revision: http://reviews.llvm.org/D5658

llvm-svn: 220570
2014-10-24 17:02:16 +00:00
Daniel Sanders e2e25da4b6 [mips] Replace MipsABIEnum with a MipsABIInfo class.
Summary:
No functional change yet, it's just an object replacement for an enum.
It will allow us to gather ABI information in a single place so that we can
start testing for properties of the ABI's instead of the ABI itself.

For example we will eventually be able to use:
  ABI.MinStackAlignmentInBytes()
instead of:
  (isABI_N32() || isABI_N64()) ? 16 : 8
which is clearer and more maintainable.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3341

llvm-svn: 220568
2014-10-24 16:15:27 +00:00
Benjamin Kramer 014601d56e [Object] Fix MachO's getUuid to return a pointer into the object instead of a dangling ArrayRef.
This works because uuid's are always little endian so it's not swapped.
Fixes use-after-return reported by asan.

llvm-svn: 220567
2014-10-24 15:52:05 +00:00
Daniel Sanders f815c137e6 [mips] Fix >80-column line
llvm-svn: 220564
2014-10-24 14:46:00 +00:00
Daniel Sanders 8d69aedbd5 [mips] Remove redundant code in RetCC_MipsN. NFC.
Summary:
i32 is always promoted to i64 so it no longer makes sense to assign i32 to
registers.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5964

llvm-svn: 220561
2014-10-24 13:49:54 +00:00
Daniel Sanders 19f01658fe [mips] For N32/N64, structs must be passed in the upper bits of a register.
Summary:
Most structs were fixed by r218451 but those of between >32-bits and
<64-bits remained broken since they were not marked with [ASZ]ExtUpper.
This patch fixes the remaining cases by using
CCPromoteToUpperBitsInType<i64> on i64's in addition to i32 and smaller.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5963

llvm-svn: 220556
2014-10-24 13:09:19 +00:00
Oliver Stannard f7a5afc3f2 [AArch64] Fix fast-isel of cbz of i1, i8, i16
This fixes a miscompilation in the AArch64 fast-isel which was
triggered when a branch is based on an icmp with condition eq or ne,
and type i1, i8 or i16. The cbz instruction compares the whole 32-bit
register, so values with the bottom 1, 8 or 16 bits clear would cause
the wrong branch to be taken.

llvm-svn: 220553
2014-10-24 09:54:41 +00:00
Marcello Maggioni 2259400f8e Added reset of LexicalScope in LiveDebugVariables reset function.
llvm-svn: 220545
2014-10-24 02:46:50 +00:00
Timur Iskhodzhanov 2bc90fdbdc Fix PR21189 -- Emit symbol subsection required to debug LLVM-built binaries with VS2012+
Reviewed at http://reviews.llvm.org/D5772

llvm-svn: 220544
2014-10-24 01:27:45 +00:00
David Blaikie f4c504e03c DebugInfo: Remove DwarfDebug::addScopeVariable now that it's just a trivial wrapper
llvm-svn: 220542
2014-10-24 00:43:47 +00:00
Adam Nemet 832ec5e911 [AVX512] FMA support for the 231 variants
This is asm/diasm-only support, similar to AVX.

For ISeling the register variant, they are no different from 213 other than
whether the multiplication or the addition operand is destructed.

For ISeling the memory variant, i.e. to fold a load, they are no different
than the 132 variant.  The addition operand (op3) in both cases can come from
memory.  Again the ony difference is which operand is destructed.

There could be a post-RA pass that would convert a 213 or 132 into a 231.

Part of <rdar://problem/17082571>

llvm-svn: 220540
2014-10-24 00:03:00 +00:00
Adam Nemet 26371ce131 [AVX512] Introduce fma3p_forms from AVX
This multiclass generates the different forms: 213, 231, 132 in AVX.

132 in AVX512 is a separate class but I am planning to use this same
multiclass to generate 231 relying on the nice the null_frag trick from AVX to
disable codegen pattern for 231.

No functionality change, no change in X86.td.expanded except for the different
instruction definition names.

llvm-svn: 220539
2014-10-24 00:02:55 +00:00
Nick Lewycky 592d84974c If requested, apply function merging at -O0 too. It's useful there to reduce the time to compile.
llvm-svn: 220537
2014-10-23 23:49:31 +00:00
Timur Iskhodzhanov eb229ca928 Make getDISubprogram(const Function *F) available in LLVM
Reviewed at http://reviews.llvm.org/D5950

llvm-svn: 220536
2014-10-23 23:46:28 +00:00
Ahmed Bougacha 7daf3b89f9 [SelectionDAG] Teach the vector scalarizer about FP conversions.
This adds support for legalization of instructions of the form:

  [fp_conv] <1 x i1> %op to <1 x double>

where fp_conv is one of fpto[us]i, [us]itofp.  This used to assert
because they were simply missing from the vector operand scalarizer.

A similar problem arose in r190830, with trunc instead.

Fixes PR20778.

Differential Revision: http://reviews.llvm.org/D5810

llvm-svn: 220533
2014-10-23 22:49:25 +00:00
Ahmed Bougacha e05afd4d1e Update comment and fix typos in assert message. (NFC)
llvm-svn: 220531
2014-10-23 22:40:34 +00:00
Tim Northover e4c7be56bf ScheduleDAG: record PhysReg dependencies represented by CopyFromReg nodes
x86's CMPXCHG -> EFLAGS consumer wasn't being recorded as a real EFLAGS
dependency because it was represented by a pair of CopyFromReg(EFLAGS) ->
CopyToReg(EFLAGS) nodes. ScheduleDAG was expecting the source to be an
implicit-def on the instruction, where the result numbers in the DAG and the
Uses list in TableGen matched up precisely.

The Copy notation seems much more robust, so this patch extends ScheduleDAG
rather than refactoring x86.

Should fix PR20376.

llvm-svn: 220529
2014-10-23 22:31:48 +00:00
David Blaikie 1dd573db45 DebugInfo: Remove DwarfDebug::CurrentFnArguments since we have to handle argument ordering of other arguments (abstract arguments) in the same way and already have code for that too.
While refactoring this code I was confused by both the name I had
introduced (addNonArgumentVariable... but it has all this logic to
handle argument numbering and keep things in order?) and by the
redundancy. Seems when I fixed the misordered inlined argument handling,
I didn't realize it was mostly redundant with the argument ordering code
(which I may've also written, I'm not sure). So let's just rely on the
more general case.

The only oddity in output this produces is that it means when we emit
all the variables for the current function, we don't track when we've
finished the argument variables and are about to start the local
variables and insert DW_AT_unspecified_parameters (for varargs
functions) there. Instead it ends up after the local variables, scopes,
etc. But this isn't invalid and doesn't cause DWARF consumers problems
that I know of... so we'll just go with that because it makes the code
nice & simple.

(though, let's see what the buildbots have to say about this - *crosses
fingers*)

There will be some cleanup commits to follow to remove the now trivial
wrappers, etc.

llvm-svn: 220527
2014-10-23 22:27:50 +00:00
David Blaikie 3b5c84008d DebugInfo: Sink DwarfDebug::addNonArgumentScopeVariable into DwarfFile.
llvm-svn: 220520
2014-10-23 22:04:30 +00:00
Ahmed Bougacha 5175bcf43a [X86] Improve mul w/ overflow codegen, to MUL8+SETO.
Currently, @llvm.smul.with.overflow.i8 expands to 9 instructions, where
3 are really needed.

This adds X86ISD::UMUL8/SMUL8 SD nodes, and custom lowers them to
MUL8/IMUL8 + SETO.

i8 is a special case because there is no two/three operand variants of
(I)MUL8, so the first operand and return value need to go in AL/AX.

Also, we can't write patterns for these instructions: TableGen refuses
patterns where output operands don't match SDNode results. In this case,
instructions where the output operand is an implicitly defined register.

A related special case (and FIXME) exists for MUL8 (X86InstrArith.td):

  // FIXME: Used for 8-bit mul, ignore result upper 8 bits.
  // This probably ought to be moved to a def : Pat<> if the
  // syntax can be accepted.
  [(set AL, (mul AL, GR8:$src)), (implicit EFLAGS)]

Ideally, these go away with UMUL8, but we still need to improve TableGen
support of implicit operands in patterns.

Before this change:
  movsbl  %sil, %eax
  movsbl  %dil, %ecx
  imull   %eax, %ecx
  movb    %cl, %al
  sarb    $7, %al
  movzbl  %al, %eax
  movzbl  %ch, %esi
  cmpl    %eax, %esi
  setne   %al

After:
  movb    %dil, %al
  imulb   %sil
  seto    %al

Also, remove a made-redundant testcase for PR19858, and enable more FastISel
ALU-overflow tests for SelectionDAG too.

Differential Revision: http://reviews.llvm.org/D5809

llvm-svn: 220516
2014-10-23 21:55:31 +00:00
David Blaikie ff4181adec DebugInfo: Remove DwarfDebug::addCurrentFnArgument declaration now that it's moved to DwarfFile.
llvm-svn: 220515
2014-10-23 21:53:17 +00:00
Sanjay Patel 848309da7c Handle sqrt() shrinking in SimplifyLibCalls like any other call
This patch removes a chunk of special case logic for folding 
(float)sqrt((double)x) -> sqrtf(x)
in InstCombineCasts and handles it in the mainstream path of SimplifyLibCalls.

No functional change intended, but I loosened the restriction on the existing
sqrt testcases to allow for this optimization even without unsafe-fp-math because
that's the existing behavior.

I also added a missing test case for not shrinking the llvm.sqrt.f64 intrinsic
in case the result is used as a double.

Differential Revision: http://reviews.llvm.org/D5919

llvm-svn: 220514
2014-10-23 21:52:45 +00:00
Kevin Enderby 6f326ce75b Update llvm-objdump’s Mach-O symbolizer code for Objective-C references.
This prints disassembly comments for Objective-C references to CFStrings,
Selectors, Classes and method calls.

llvm-svn: 220500
2014-10-23 19:37:31 +00:00
David Blaikie 49cfc8ca8e DebugInfo: Simplify/tidy/correct global variable decl/def emission handling.
This fixes a bug (introduced by fixing the IR emitted from Clang where
the definition of a static member would be scoped within the class,
rather than within its lexical decl context) where the definition of a
static variable would be placed inside a class.

It also improves source fidelity by scoping static class member
definitions inside the lexical decl context in which tehy are written
(eg: namespace n { class foo { static int i; } int foo::i; } - the
definition of 'i' will be within the namespace 'n' in the DWARF output
now).

Lastly, and the original goal, this reduces debug info size slightly
(and makes debug info easier to read, etc) by placing the definitions of
non-member global variables within their namespace, rather than using a
separate namespace-scoped declaration along with a definition at global
scope.

Based on patches and discussion with Frédéric.

llvm-svn: 220497
2014-10-23 19:12:43 +00:00
Reid Kleckner 5b2787dfb2 Revert "Don't count inreg params when mangling fastcall functions"
This reverts commit r214981.

I'm not sure what I was thinking when I wrote this. Testing with MSVC
shows that this function is mangled to '@f@8':
  int __fastcall f(int a, int b);

llvm-svn: 220492
2014-10-23 17:50:42 +00:00
David Blaikie bcfa8a56b1 Remove explicit (void) use of DwarfFile::DD that was accidentally left in r220452.
Caught in post-commit review by Frédéric.

llvm-svn: 220487
2014-10-23 16:12:58 +00:00
Renato Golin 6fb9c2ea70 Do not emit intermediate register for zero FP immediate
This updates check for double precision zero floating point constant to allow
use of instruction with immediate value rather than temporary register.
Currently "a == 0.0", where "a" is of "double" type generates:

vmov.i32        d16, #0x0
vcmpe.f64       d0, d16

With this change it becomes:

vcmpe.f64        d0, #0

Patch by Sergey Dmitrouk.

llvm-svn: 220486
2014-10-23 15:31:50 +00:00
Rafael Espindola 1b47a28ff9 clang-format two code snippets to make the next patch easy to read.
llvm-svn: 220484
2014-10-23 15:20:05 +00:00
NAKAMURA Takumi 5b6f789d7a Hexagon/Disassembler/LLVMBuild.txt: Update libdeps.
llvm-svn: 220482
2014-10-23 11:32:16 +00:00
NAKAMURA Takumi f459febb15 Hexagon/LLVMBuild.txt: Prune CRLF.
llvm-svn: 220481
2014-10-23 11:32:03 +00:00
NAKAMURA Takumi bd20251a4a [CMake] Prune CRLF in CMakeLists.txt(s).
llvm-svn: 220480
2014-10-23 11:31:50 +00:00
NAKAMURA Takumi 504bbf91cd Revert r220427, "[Hexagon] Adding encoding bits for add opcode."
It brought cyclic dependecy between HexagonAsmPrinter and HexagonDesc.

llvm-svn: 220478
2014-10-23 11:31:22 +00:00
Zoran Jovanovic 42b8444372 [mips][microMIPS] Implement ADDIUR1SP instruction
Differential Revision: http://reviews.llvm.org/D5153

llvm-svn: 220477
2014-10-23 11:13:59 +00:00
Zoran Jovanovic bac3619b29 ps][microMIPS] Implement ADDIUR2 instruction
Differential Revision: http://reviews.llvm.org/D5151

llvm-svn: 220476
2014-10-23 11:06:34 +00:00
Zoran Jovanovic 9bda2f1926 ps][microMIPS] Implement LI16 instruction
Differential Revision: http://reviews.llvm.org/D5149

llvm-svn: 220475
2014-10-23 10:59:24 +00:00
Zoran Jovanovic 4a00fdc2e3 [mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions
Differential Revision: http://reviews.llvm.org/D5774

llvm-svn: 220474
2014-10-23 10:42:01 +00:00
Oliver Stannard 39a85abddf [Thumb2] Improve disassembly of memory hints
Currently, the ARM disassembler will disassemble the Thumb2 memory hint
instructions (PLD, PLDW and PLI), even for targets which do not have
these instructions. This patch adds the required checks to the
disassmebler.

llvm-svn: 220472
2014-10-23 08:52:58 +00:00
Akira Hatanaka 2ee0e9e6ee [ARM, stack protector] If supported, use armv7 instructions.
This commit enables using movt/movw to load the stack guard address:

movw r0, :lower16:(L_g3$non_lazy_ptr-(LPC0_0+8))
movt r0, :upper16:(L_g3$non_lazy_ptr-(LPC0_0+8))
ldr r0, [pc, r0]

Previously a pc-relative load was emitted:

ldr r0, LCPI0_0
ldr r0, [pc, r0]

rdar://problem/18740489

llvm-svn: 220470
2014-10-23 04:17:05 +00:00
Frederic Riss c1892e2d48 Assert that ValueHandleBase::ValueIsRAUWd doesn't change the tracked Value type.
This invariant is enforced in Value::replaceAllUsesWith, thus it seems
logical to apply it also to ValueHandles. This commit fixes InstCombine
to not trigger the assertion during the removal of constant bitcasts in
call instructions.

Differential Revision: http://reviews.llvm.org/D5828

llvm-svn: 220468
2014-10-23 04:08:42 +00:00
Frederic Riss 05ad2e543f Modernize doxygen comments in Support/Dwarf.h
In post-commit review of r219442, Rafael pointed out that the comment style
of the newly introduced helper didn't follow LLVM's coding standard.
Modernize the whole file to the new standards.

Differential Revision: http://reviews.llvm.org/D5918

llvm-svn: 220467
2014-10-23 04:08:38 +00:00
Frederic Riss e939b43aa4 [dwarfdump] Dump DW_AT_ranges values inline in the debug_info dump.
The output looks like that:
                      DW_AT_ranges [FORM_data4]    (0x00000000
                         [0x00000001000024a0 - 0x00000001000024c2)
                         [0x0000000100002505 - 0x000000010000268b))

Differential Revision: http://reviews.llvm.org/D5712

llvm-svn: 220466
2014-10-23 04:08:34 +00:00
Evgeniy Stepanov 7db296eba5 [msan] Emit checks for constant shadow values under an experimental flag.
Does not change the default behavior.

llvm-svn: 220457
2014-10-23 01:05:46 +00:00
David Blaikie f299947bfa [DebugInfo] Sink DwarfDebug::addCurrentFnArgument down into DwarfFile.
Variable handling will be sunk into DwarfFile so that abstract variables
and the like can be shared across multiple CUs (to handle cross-CU
inlining, for example).

llvm-svn: 220453
2014-10-23 00:16:05 +00:00
David Blaikie 2b22b1e9a2 [DebugInfo] Add DwarfDebug& to DwarfFile.
Use the DwarfDebug in one function that previously took it as a
parameter, and lay the foundation for use this for other operations
coming soon.

llvm-svn: 220452
2014-10-23 00:16:03 +00:00
David Blaikie 263a008525 [DebugInfo] Remove LexicalScopes::isCurrentFunctionScope and CSE a use of LexicalScopes::getCurrentFunctionScope
Now that we're sure the only root (non-abstract) scope is the current
function scope, there's no need for isCurrentFunctionScope, the property
can be tested directly instead.

llvm-svn: 220451
2014-10-23 00:06:27 +00:00
Lang Hames efe7e22673 [MCJIT] Make repeat calls to MCJIT::getPointerToFunction for declarations safe.
MCJIT::getPointerForFunction adds the resulting address to the global mapping.
This should be done via updateGlobalMapping rather than addGlobalMapping, since
the latter asserts if a mapping already exists.

MCJIT::getPointerToFunction is actually deprecated - hopefully we can remove it
(or more likely re-task it) entirely soon. In the mean time it should at least
work as advertised.

<rdar://problem/18727946>

llvm-svn: 220444
2014-10-22 23:18:42 +00:00
David Majnemer ac7dc6e701 Attempt to fix the build after r220439
llvm-svn: 220440
2014-10-22 22:46:05 +00:00
Derek Schuff 5f708e5ec8 [MC] Attach labels to existing fragments instead of using a separate fragment
Summary:
Currently when emitting a label, a new data fragment is created for it if the
current fragment isn't a data fragment.
This change instead enqueues the label and attaches it to the next fragment
(e.g. created for the next instruction) if possible.

When bundle alignment is not enabled, this has no functionality change (it
just results in fewer extra fragments being created). For bundle alignment,
previously labels would point to the beginning of the bundle padding instead
of the beginning of the emitted instruction. This was not only less efficient
(e.g. jumping to the nops instead of past them) but also led to miscalculation
of the address of the GOT (since MC uses a label difference rather than
emitting a "." symbol).

Fixes https://code.google.com/p/nativeclient/issues/detail?id=3982

Test Plan: regression test attached

Reviewers: jvoung, eliben

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D5915

llvm-svn: 220439
2014-10-22 22:38:06 +00:00
Colin LeMahieu 73a51a1a68 [Hexagon] Adding encoding bits for add opcode.
Adding llvm-mc tests.
Removing unit tests.

http://reviews.llvm.org/D5624

llvm-svn: 220427
2014-10-22 20:58:35 +00:00
Chad Rosier dcd2a3014c [AArch64] Add support for the .inst directive.
This has been implement using the MCTargetStreamer interface as is done in the
ARM, Mips and PPC backends.

Phabricator: http://reviews.llvm.org/D5891
PR20964

llvm-svn: 220422
2014-10-22 20:35:57 +00:00
Benjamin Kramer 7ad22403fb Strength reduce constant-sized vectors into arrays. No functionality change.
llvm-svn: 220412
2014-10-22 19:55:26 +00:00
Benjamin Kramer 26ce8ff637 LoopVectorize: Simplify code. No functionality change.
llvm-svn: 220405
2014-10-22 19:13:54 +00:00
Diego Novillo 19e7b7e27c Shorten auto iterators for function basic blocks.
Use consistent naming for basic block instances.

No functional changes.

llvm-svn: 220404
2014-10-22 18:39:50 +00:00
Hans Wennborg db08566588 Fix VS2012 build; C++11 type aliases are not supported.
llvm-svn: 220399
2014-10-22 17:47:49 +00:00
Colin LeMahieu b424cb1e57 Ammending 220393 - Removing unused decoding tables.
llvm-svn: 220397
2014-10-22 17:23:01 +00:00
Colin LeMahieu 9950d5c59a Ammending 220393 - Removing unused functions.
llvm-svn: 220396
2014-10-22 17:03:19 +00:00
Bill Schmidt 9c54bbd791 [PATCH] Support select-cc for VSFRC when VSX is enabled
A previous patch enabled SELECT_VSRC and SELECT_CC_VSRC for VSX to
handle <2 x double> cases.  This patch adds SELECT_VSFRC and
SELECT_CC_VSFRC to allow use of all 64 vector-scalar registers for the
f64 type when VSX is enabled.  The changes are analogous to those in
the previous patch.  I've added a new variant to vsx.ll to test the
code generation.

(I also cleaned up a little formatting in PPCInstrVSX.td from the
previous patch.)

llvm-svn: 220395
2014-10-22 16:58:20 +00:00
Diego Novillo b368b7d558 Use auto iteration in lib/Transforms/Scalar/SampleProfile.cpp. No functional changes.
llvm-svn: 220394
2014-10-22 16:51:50 +00:00
Colin LeMahieu 88ebb9e2da [Hexagon] Adding basic disassembler.
Marking all instructions as CodeGenOnly since encoding bits are not set yet.
http://reviews.llvm.org/D5829?vs=on&id=15023&whitespace=ignore-all#toc

llvm-svn: 220393
2014-10-22 16:49:14 +00:00
Philip Reames d92c2a7592 Preserving 'nonnull' metadata in SimplifyCFG
When we hoist two loads above an if, we can preserve the nonnull metadata.  We could also do the same for sinking them, but we appear to not handle metadata at all in that case.

Thanks to Hal for the review.

Differential Revision: http://reviews.llvm.org/D5910

llvm-svn: 220392
2014-10-22 16:37:13 +00:00
Sanjay Patel a92fa44740 Shrinkify libcalls: use float versions of double libm functions with fast-math (bug 17850)
When a call to a double-precision libm function has fast-math semantics 
(via function attribute for now because there is no IR-level FMF on calls), 
we can avoid fpext/fptrunc operations and use the float version of the call
if the input and output are both float.

We already do this optimization using a command-line option; this patch just
adds the ability for fast-math to use the existing functionality.

I moved the cl::opt from InstructionCombining into SimplifyLibCalls because
it's only ever used internally to that class.

Modified the existing test cases to use the unsafe-fp-math attribute rather
than repeating all tests.

This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850

Differential Revision: http://reviews.llvm.org/D5893

llvm-svn: 220390
2014-10-22 15:29:23 +00:00
Diego Novillo a67c0b43e1 Change error to warning when a profile cannot be found.
When the profile for a function cannot be applied, we use to emit an
error. This seems extreme. The compiler can continue, it's just that the
optimization opportunities won't include profile information.

llvm-svn: 220386
2014-10-22 13:36:35 +00:00
Bill Schmidt 61e652334f [PowerPC] Support select-cc for VSX
The tests test/CodeGen/Generic/select-cc.ll and
test/CodeGen/PowerPC/select-cc.ll both fail with VSX enabled.  The
problem is that the lowering logic for the SELECT and SELECT_CC
operations doesn't currently support the VSX registers.  This patch
fixes that.

In lib/Target/PowerPC/PPCInstrInfo.td, we have pseudos to handle this
for other register classes.  Similar pseudos are added in
PPCInstrVSX.td (they must be there, because the "vsrc" register class
definition appears there) for the VSRC register class.  The
SELECT_VSRC pseudo is then used in pattern matching for SELECT_CC.

The rest of the patch just adds logic for SELECT_VSRC wherever similar
logic appears for SELECT_VRRC.

There are no new test cases because the existing tests above test
this, along with a variant in test/CodeGen/PowerPC/vsx.ll.

After discussion with Hal, a future patch will add similar _VSFRC
variants to override f64 type handling (currently using F8RC).

llvm-svn: 220385
2014-10-22 13:13:40 +00:00
Diego Novillo 8027b80b41 Support using sample profiles with partial debug info.
Summary:
When using a profile, we used to require the use -gmlt so that we could
get access to the line locations. This is used to match line numbers in
the input profile to the line numbers in the function's IR.

But this is actually not necessary. The driver can provide source
location tracking without the emission of debug information. In these
cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the
actual line location annotations are still present.

This patch adds a new way of looking for the start of the current
function. Instead of looking through the compile units in llvm.dbg.cu,
we can walk up the scope for the first instruction in the function with
a debug loc. If that describes the function, we use it. Otherwise, we
keep looking until we find one.

If no such instruction is found, we then give up and produce an error.

Reviewers: echristo, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5887

llvm-svn: 220382
2014-10-22 12:59:00 +00:00
Arnaud A. de Grandmaison 9b3330546b [AArch64] Cleanup A57PBQPConstraints
And add a long awaited testcase.

llvm-svn: 220381
2014-10-22 12:40:20 +00:00
Bruno Cardoso Lopes c29520c5b3 [InstSimplify] Support constant folding to vector of pointers
ConstantFolding crashes when trying to InstSimplify the following load:

@a = private unnamed_addr constant %mst {
     i8* inttoptr (i64 -1 to i8*),
     i8* inttoptr (i64 -1 to i8*)
}, align 8

%x = load <2 x i8*>* bitcast (%mst* @a to <2 x i8*>*), align 8

This patch fix this by adding support to this type of folding:

%x = load <2 x i8*>* bitcast (%mst* @a to <2 x i8*>*), align 8
==> gets folded to:
  %x = <2 x i8*> <i8* inttoptr (i64 -1 to i8*), i8* inttoptr (i64 -1 to i8*)>

llvm-svn: 220380
2014-10-22 12:18:48 +00:00
Jyoti Allur 3b68607eac [Thumb/Thumb2] Implement restrictions on SP in register list on LDM, STM variants in thumb mode
llvm-svn: 220379
2014-10-22 10:41:14 +00:00
Matt Arsenault 8226ca29e2 Fix typo
llvm-svn: 220353
2014-10-22 00:28:59 +00:00
Evgeniy Stepanov 35eb265421 [msan] Handle param-tls overflow.
ParamTLS (shadow for function arguments) is of limited size. This change
makes all arguments that do not fit unpoisoned, and avoids writing
past the end of a TLS buffer.

llvm-svn: 220351
2014-10-22 00:12:40 +00:00
Hans Wennborg 0b39fc0d16 Revert "Teach the load analysis to allow finding available values which require" (r220277)
This seems to have caused PR21330.

llvm-svn: 220349
2014-10-21 23:49:52 +00:00
Lang Hames 41d95947cf [MCJIT] Defer application of AArch64 MachO GOT relocations until resolve time.
On AArch64, GOT references are page relative (ADRP + LDR), so they can't be
applied until we know exactly where, within a page, the GOT entry will be in
the target address space.

Fixes <rdar://problem/18693976>.

llvm-svn: 220347
2014-10-21 23:41:15 +00:00
JF Bastien f42a6ea5ac LTO: respect command-line options that disable vectorization.
Summary: Patches 202051 and 208013 added calls to LTO's PassManager which unconditionally add LoopVectorizePass and SLPVectorizerPass instead of following the logic in PassManagerBuilder::populateModulePassManager and honoring the -vectorize-loops -run-slp-after-loop-vectorization flags.

Reviewers: nadav, aschwaighofer, yijiang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5884

llvm-svn: 220345
2014-10-21 23:18:21 +00:00
Matt Arsenault 7c93690be0 Add minnum / maxnum codegen
llvm-svn: 220342
2014-10-21 23:01:01 +00:00
Matt Arsenault d6511b49ac Add minnum / maxnum intrinsics
These are named following the IEEE-754 names for these
functions, rather than the libm fmin / fmax to avoid
possible ambiguities. Some languages may implement something
resembling fmin / fmax which return NaN if either operand is
to propagate errors. These implement the IEEE-754 semantics
of returning the other operand if either is a NaN representing
missing data.

llvm-svn: 220341
2014-10-21 23:00:20 +00:00
Duncan P. N. Exon Smith 44e5b4e533 IR: Reorder metadata bitcode serialization, NFC
Enumerate `MDNode`'s operands *before* the node itself, so that the
reader requires less RAUW.  Although this will cause different code
paths to be hit in the reader, this should effectively be no
functionality change.

llvm-svn: 220340
2014-10-21 22:27:47 +00:00
Matt Arsenault 75c658e2cc R600/SI: Add missing parameter to div_fmas intrinsic
llvm-svn: 220338
2014-10-21 22:20:55 +00:00
Duncan P. N. Exon Smith 60d87e7253 IR: Remove dead code in metadata bitcode writing, NFC
No one cares how many uses each metadata value has, so don't bother
counting.

llvm-svn: 220337
2014-10-21 22:13:34 +00:00
Arnaud A. de Grandmaison 5c7fe7e97b Pacify bots and simplify r220321
llvm-svn: 220335
2014-10-21 21:50:49 +00:00
Matt Arsenault 8c4fb7cae0 R600: Use default GlobalDirective
The overridden one wasn't inserting a space,
so you would end up with .globalfoo

llvm-svn: 220329
2014-10-21 21:08:36 +00:00
Philip Reames d7c21364a9 Teach combineMetadata how to merge 'nonnull' metadata.
combineMetadata is used when merging two instructions into one.  This change teaches it how to merge 'nonnull' - i.e. only preserve it on the new instruction if it's set on both sources.  This isn't actually used yet since I haven't adjusted any of the call sites to pass in nonnull as a 'known metadata'.  

llvm-svn: 220325
2014-10-21 21:02:19 +00:00
Philip Reames b2d3f035e2 Preserve 'nonnull' when changing type of the load.
When changing the type of a load in Chandler's recent InstCombine changes, we can preserve the new 'nonnull' metadata.  

I considered adding an assert since 'nonnull' is only valid on pointer types, but casting a pointer to a non-pointer would involve more than a bitcast anyways.  If someone extends this transform to handle more than bitcasts, the verifier will report the malformed IR, so a separate assertion isn't needed.  Also, the fpmath flags would have the same problem.

llvm-svn: 220324
2014-10-21 21:00:03 +00:00
Philip Reames 0ca58b33cf Extend the verifier to check usage of 'nonnull' metadata.
The recently added !nonnull metadata is only valid on loads of pointer type.  

llvm-svn: 220323
2014-10-21 20:56:29 +00:00
Arnaud A. de Grandmaison a61262f989 [PBQP] Teach PassConfig to tell if the default register allocator is used.
This enables targets to adapt their pass pipeline to the register
allocator in use. For example, with the AArch64 backend, using PBQP
with the cortex-a57, the FPLoadBalancing pass is no longer necessary.

llvm-svn: 220321
2014-10-21 20:47:22 +00:00
David Majnemer d205602a0b InstCombine: Simplify FoldICmpCstShrCst
This function was complicated by the fact that it tried to perform
canonicalizations that were already preformed by InstSimplify.  Remove
this extra code and move the tests over to InstSimplify.  Add asserts to
make sure our preconditions hold before we make any assumptions.

llvm-svn: 220314
2014-10-21 19:51:55 +00:00
Rafael Espindola f03ae4efa7 Drop support for an old version of ld64 (from darwin 9).
llvm-svn: 220310
2014-10-21 18:31:09 +00:00
Sanjay Patel d5aa255146 remove function names from comments; NFC
llvm-svn: 220309
2014-10-21 18:26:57 +00:00
Matt Arsenault e306a32325 R600/SI: Add pattern for bswap
llvm-svn: 220304
2014-10-21 16:25:08 +00:00
Arnaud A. de Grandmaison d3648d0cbe [PBQP] Fix coalescing benefits
As coalescing registers is a benefit, the cost should be improved (i.e. made smaller) when coalescing is possible.

llvm-svn: 220302
2014-10-21 16:24:15 +00:00
NAKAMURA Takumi 9ff272f382 X86AsmInstrumentation.cpp: Dissolve initializer-ranged-for. MSC17 disliked it.
llvm-svn: 220301
2014-10-21 16:22:52 +00:00
Colin LeMahieu 7055365e77 Test commit
Fixing brief comment.

llvm-svn: 220299
2014-10-21 16:03:10 +00:00
Bill Schmidt 5c6cb813b6 [PowerPC] Avoid VSX FMA mutate when killed product reg = addend reg
With VSX enabled, test/CodeGen/PowerPC/recipest.ll exposes a bug in
the FMA mutation pass.  If we have a situation where a killed product
register is the same register as the FMA target, such as:

   %vreg5<def,tied1> = XSNMSUBADP %vreg5<tied0>, %vreg11, %vreg5,
                       %RM<imp-use>; VSFRC:%vreg5 F8RC:%vreg11 

then the substitution makes no sense.  We end up getting a crash when
we try to extend the interval associated with the killed product
register, as there is already a live range for %vreg5 there.  This
patch just disables the mutation under those circumstances.

Since recipest.ll generates different code with VMX enabled, I've
modified that test to use -mattr=-vsx.  I've borrowed the code from
that test that exposed the bug and placed it in fma-mutate.ll, where
it tests several mutation opportunities including the "bad" one.

llvm-svn: 220290
2014-10-21 13:02:37 +00:00
Oliver Stannard cdb8db8d3c [ARM] NEON 32-bit scalar moves are also available in VFPv2
The 32-bit variants of the NEON scalar<->GPR move instructions are
also available in VFPv2. The 8- and 16-bit variants do require NEON.

Note that the checks in the test file are all -DAG because they are
checking a mixture of stdout and stderr, and the ordering is not
guaranteed.

llvm-svn: 220288
2014-10-21 11:49:14 +00:00
Yuri Gorshenin 171eb8dbeb [asan-asm-instrumentation] Fixed memory accesses with rbp as a base or an index register.
Summary: Fixed memory accesses with rbp as a base or an index register.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5819

llvm-svn: 220283
2014-10-21 10:22:27 +00:00
Oliver Stannard 38e6d45a46 [Thumb2] LDRS?[BH] cannot load to the PC
The Thumb2 LDRS?[BH] instructions are not valid when the destination
register is the PC (these encodings are used for preload hints).

llvm-svn: 220278
2014-10-21 09:14:15 +00:00
Chandler Carruth aa72a6dd3b Teach the load analysis to allow finding available values which require
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.

To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.

These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.

I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.

llvm-svn: 220277
2014-10-21 09:00:40 +00:00
Zoran Jovanovic 592239d498 [mips][microMIPS] Implement ADDU16 and SUBU16 instructions
Differential Revision: http://reviews.llvm.org/D5118

llvm-svn: 220276
2014-10-21 08:44:58 +00:00
Zoran Jovanovic 81ceebc56e [mips][microMIPS] Implement AND16, NOT16, OR16 and XOR16 instructions
Differential Revision: http://reviews.llvm.org/D5117

llvm-svn: 220275
2014-10-21 08:32:40 +00:00
Zoran Jovanovic b0852e5410 [mips][microMIPS] Implement microMIPS 16-bit instructions registers
Differential Revision: http://reviews.llvm.org/D5116

llvm-svn: 220273
2014-10-21 08:23:11 +00:00
Rafael Espindola c606bfe660 Fix a bit of confusion about .set and produce more readable assembly.
Every target we support has support for assembly that looks like

a = b - c
.long a

What is special about MachO is that the above combination suppresses the
production of a relocation.

With this change we avoid producing the intermediary labels when they don't
add any value.

llvm-svn: 220256
2014-10-21 01:17:30 +00:00
Paul Robinson f60e0a160f Do not attribute static allocas to the call site's DebugLoc.
When functions are inlined, instructions without debug information are
attributed to the call site's DebugLoc. After inlining, inlined static
allocas are moved to the caller's entry block, adjacent to the caller's
original static alloca instructions. By retaining the call site's
DebugLoc, these instructions could cause instructions that were
subsequently inserted at the entry block to pick up the same DebugLoc.

Patch by Wolfgang Pieb!

llvm-svn: 220255
2014-10-21 01:00:55 +00:00
David Blaikie df9515324d PR21202: Memory leak in Windows RWMutexImpl when using SRWLOCK
llvm-svn: 220251
2014-10-21 00:34:39 +00:00
Rafael Espindola 74dd8547db Make AsmPrinter::EmitLabelOffsetDifference a static helper and simplify.
It had exactly one caller in a position where we know hasSetDirective is true.

llvm-svn: 220250
2014-10-21 00:25:49 +00:00
Lang Hames 2d0d096bd1 [MCJIT] Temporarily revert r220245 - it broke several bots.
(See e.g. http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/17653)

llvm-svn: 220249
2014-10-21 00:24:02 +00:00
Philip Reames 5a3f5f751b Introduce enum values for previously defined metadata types. (NFC)
Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well.  Using a preassigned ID is helpful since we get compile time type checking, and avoid some (minimal) string construction and comparison.  This change adds enum value for three existing metadata types:
+    MD_nontemporal = 9, // "nontemporal"
+    MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access"
+    MD_nonnull = 11 // "nonnull"

I went through an updated various uses as well.  I made no attempt to get all uses; I focused on the ones which were easily grepable and easily to translate.  For example, there were several items in LoopInfo.cpp I chose not to update.

llvm-svn: 220248
2014-10-21 00:13:20 +00:00
Philip Reames bf9676f7f0 Extend the verifier to validate range metadata on calls and invokes.
Range metadata applies to loads, call, and invokes.  We were validating that metadata applied to loads was correct according to the LangRef, but we were not validating metadata applied to calls or invokes.  This change extracts the checking functionality to a common location, reuses it for all valid locations, and adds a simple test to ensure a misused range on a call gets reported.

llvm-svn: 220246
2014-10-20 23:52:07 +00:00
Lang Hames 84801c217c [MCJIT] Make MCJIT honor symbol visibility settings when populating the global
symbol table.

Patch by Anthony Pesch. Thanks Anthony!

llvm-svn: 220245
2014-10-20 23:39:54 +00:00
Quentin Colombet 06355199f1 [X86] Fix a bug in the lowering of the mask of VSELECT.
X86 code to lower VSELECT messed a bit with the bits set in the mask of VSELECT
when it knows it can be lowered into BLEND. Indeed, only the high bits need to be
set for those and it optimizes those accordingly.
However, when the mask is a compile time constant, the lowering will be handled
by the generic optimizer and those modifications will generate bad code in the
generic optimizer.

This patch fixes that by preventing the optimization if the VSELECT will be
handled by the generic optimizer.

<rdar://problem/18675020>

llvm-svn: 220242
2014-10-20 23:13:30 +00:00
Philip Reames cdb72f369f Introduce a 'nonnull' metadata on Load instructions.
The newly introduced 'nonnull' metadata is analogous to existing 'nonnull' attributes, but applies to load instructions rather than call arguments or returns.  Long term, it would be nice to combine these into a single construct.   The value of the load is allowed to vary between successive loads, but null is not a valid value to be loaded by any load marked nonnull.

Reviewed by: Hal Finkel
Differential Revision:  http://reviews.llvm.org/D5220

llvm-svn: 220240
2014-10-20 22:40:55 +00:00
Simon Pilgrim 2f9548a3ef [X86] Memory folding for commutative instructions (updated)
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.

Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt.

Added additional regression test case provided by Joerg Sonnenberger.

Differential Revision: http://reviews.llvm.org/D5818

llvm-svn: 220239
2014-10-20 22:14:22 +00:00
Tim Northover 23075ccee7 ARM: rework Thumb1 frame index rewriting
The previous code had a few problems, motivating the choices here.

1. It could create instructions clobbering CPSR, but the incoming MachineInstr
   didn't reflect this. A potential source of corruption. This is why the patch
   has a new PseudoInst for before lowering.
2. Similarly, there was some code to handle the incoming instruction not being
   ARMCC::AL, but this would have caused massive problems if it was actually
   invoked when a complex offset needing more than one instruction was requested.
3. It wasn't designed to handle unaligned pointers (or offsets). These should
   probably be minimised anyway, but the code needs to deal with them properly
   regardless.
4. It had some rather dubious ad-hoc code to avoid calling
   emitThumbRegPlusImmediate, a function which should be designed to do precisely
   this job.

We seem to cover the common cases correctly now, and hopefully can enhance
emitThumbRegPlusImmediate to handle any extra optimisations we need to add in
future.

llvm-svn: 220236
2014-10-20 21:28:41 +00:00
Alexey Samsonov 798ff92c3c Be more specific about return type of MachOUniversalBinary::getObjectForArch
llvm-svn: 220230
2014-10-20 20:30:57 +00:00
Alexey Samsonov 4a7eb380cc Constify input argument of RelocVisitor and DWARFContext constructors. NFC.
llvm-svn: 220228
2014-10-20 20:28:51 +00:00
Robert Khasanov 65c2756869 Moved out IIT_V64 from common values section.
Thanks Juergen Ributzka for notice.

llvm-svn: 220224
2014-10-20 19:25:05 +00:00
Steven Wu d05affcc81 Fix Intrinsic::getType not working with vararg
VarArg Intrinsic functions are encoded with "void" type as the last
argument. Now Intrinsic::getType can correctly return all the intrinsic
function type.

llvm-svn: 220205
2014-10-20 15:47:24 +00:00
Oliver Stannard 6672f37ed2 [Thumb2] RFE, SRS and "SUBS pc, lr" are undefined on v7M
These instructions are related to the v7[AR] exception model, and are
not defined on v7M.

llvm-svn: 220204
2014-10-20 15:37:35 +00:00
Sid Manning c374ac97c8 Remove unnecessary else.
llvm-svn: 220200
2014-10-20 13:08:19 +00:00
Oliver Stannard e8f63a54b4 [ARM] Do not select SMULW[BT] or SMLAW[BT]
The current instruction selection patterns for SMULW[BT] and SMLAW[BT]
are incorrect. These instructions multiply a 32-bit and a 16-bit value
(both signed) and return the top 32 bits of the 48-bit result. This
preserves the 16 bits of overflow, whereas the patterns they currently
match truncate the result to 16 bits then sign extend.

To select these instructions, we would need to match an ISD::SMUL_LOHI,
a sign extend, two shifts and an or. There is no way to match SMUL_LOHI
in an instruction pattern as it defines multiple values, so this would
have to be done in C++. I have raised
http://llvm.org/bugs/show_bug.cgi?id=21297 to cover allowing correct
selection of these instructions.

This fixes http://llvm.org/bugs/show_bug.cgi?id=19396

llvm-svn: 220196
2014-10-20 11:30:35 +00:00
Oliver Stannard fce039240a [Thumb] Fix crash in Thumb1RegisterInfo::rewriteFrameIndex
This function can, for some offsets from the SP, split one instruction
into two. Since it re-uses the original instruction as the first
instruction of the result, we need ensure its result register is not
marked as dead before we use it in the second instruction.

llvm-svn: 220194
2014-10-20 11:00:18 +00:00
Chandler Carruth f67321cb26 Switch the default DataLayout to be little endian, and make the variable
be BigEndian so the default can continue to be zero-initialized.

This is one of the prerequisites to making DataLayout a constant and
always available part of every module.

llvm-svn: 220193
2014-10-20 10:41:29 +00:00
Chandler Carruth a32038b006 Fix a miscompile introduced in r220178.
The original code had an implicit assumption that if the test for
allocas or globals was reached, the two pointers were not equal. With my
changes to make the pointer analysis more powerful here, I also had to
guard against circumstances where the results weren't useful. That in
turn violated the assumption and gave rise to a circumstance in which we
could have a store with both the queried pointer and stored pointer
rooted at *the same* alloca. Clearly, we cannot ignore such a store.
There are other things we might do in this code to better handle the
case of both pointers ending up at the same alloca or global, but it
seems best to at least make the test explicit in what it intends to
check.

I've added tests for both the alloca and global case here.

llvm-svn: 220190
2014-10-20 10:03:01 +00:00
David Majnemer f3cadce84c IR: Replace DataLayout::RoundUpAlignment with RoundUpToAlignment
No functional change intended, just cleaning up some code.

llvm-svn: 220187
2014-10-20 06:13:33 +00:00
Chandler Carruth 6665d62117 Fix a somewhat subtle pair of issues with JumpThreading I introduced in
r220178. First, the creation routine doesn't insert prior to the
terminator of the basic block provided, but really at the end of the
basic block. Instead, get the terminator and insert before that. The
next issue was that we need to ensure multiple PHI node entries for
a single predecessor re-use the same cast instruction rather than
creating new ones.

All of the logic here was without tests previously. I've reduced and
added a test case from the test suite that crashed without both of these
fixes.

llvm-svn: 220186
2014-10-20 05:34:36 +00:00
Chandler Carruth eeec35ae1c Teach the load analysis driving core instcombine logic and other bits of
logic to look through pointer casts, making them trivially stronger in
the face of loads and stores with intervening pointer casts.

I've included a few test cases that demonstrate the kind of folding
instcombine can do without pointer casts and then variations which
obfuscate the logic through bitcasts. Without this patch, the variations
all fail to optimize fully.

This is more important now than it has been in the past as I've started
moving the load canonicialization to more closely follow the value type
requirements rather than the pointer type requirements and thus this
needs to be prepared for more pointer casts. When I made the same change
to stores several test cases regressed without logic along these lines
so I wanted to systematically improve matters first.

llvm-svn: 220178
2014-10-20 00:24:14 +00:00
Chandler Carruth bc6378defb Do a better and more complete job of preserving metadata when combining
loads.

This handles many more cases than just the AA metadata, some of them
suggested by Hal in his review of the AA metadata handling patch. I've
tried to test this behavior where tractable to do so.

I'll point out that I have specifically *not* included a test for
debuginfo because it was going to require 2 or 3 times as much work to
craft some input which would survive the "helpful" stripping of debug
info metadata that doesn't match the desired schema. This is another
good example of why the current state of write-ability for our debug
info metadata is unacceptable. I spent over 30 minutes trying to conjure
some test case that would survive, even copying from other debug info
tests, but it always failed to survive with no explanation of why or how
I might fix it. =[

llvm-svn: 220165
2014-10-19 10:46:46 +00:00
Chandler Carruth 5b8cd2f73c Move previously dead code to handle computing the known bits of an alias
up to where it actually works as intended. The problem is that
a GlobalAlias isa GlobalValue and so the prior block handled all of the
cases.

This allows us to constant fold based on the actual constant expression
in the global alias. As an example, see the last function in the newly
added test case which explicitly aligns an unaligned pointer using
constant expression math. Without this change, we fail to see that and
fold an alignment test to zero.

llvm-svn: 220164
2014-10-19 09:06:56 +00:00
David Majnemer 312c3e5f39 InstCombine: (sub (or A B) (xor A B)) --> (and A B)
The following implements the transformation:
(sub (or A B) (xor A B)) --> (and A B).

Patch by Ankur Garg!

Differential Revision: http://reviews.llvm.org/D5719

llvm-svn: 220163
2014-10-19 08:32:32 +00:00
David Majnemer 59939acd26 InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1
The following implements the optimization for sequences of the form:
icmp eq/ne (shl Const2, A), Const1

Such sequences can be transformed to:
icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2))

This handles only the equality operators for now. Other operators need
to be handled.

Patch by Ankur Garg!

llvm-svn: 220162
2014-10-19 08:23:08 +00:00
Chandler Carruth a801dd5799 Fix a long-standing miscompile in the load analysis that was uncovered
by my refactoring of this code.

The method isSafeToLoadUnconditionally assumes that the load will
proceed with the preferred type alignment. Given that, it has to ensure
that the alloca or global is at least that aligned. It has always done
this historically when a datalayout is present, but has never checked it
when the datalayout is absent. When I refactored the code in r220156,
I exposed this path when datalayout was present and that turned the
latent bug into a patent bug.

This fixes the issue by just removing the special case which allows
folding things without datalayout. This isn't worth the complexity of
trying to tease apart when it is or isn't safe without actually knowing
the preferred alignment.

llvm-svn: 220161
2014-10-19 08:17:50 +00:00
Chandler Carruth 8a99373812 Switch how the datalayout availability test is handled in this code to
make much more sense and in theory be more correct.

If you trace the code alllll the way back to when it was first
introduced, the comments make it slightly more clear what was going on
here. At that time, the only way Base != V was if DL (then TD) was
non-null. As a consequence, if DL *was* null, that meant we were loading
directly from the alloca or global found above the test. After
refactoring, this has become at least terribly subtle and potentially
incorrect. There are many forms of pointer manipulation that can be
traversed without DataLayout, and some of them would in fact change the
size of object being loaded vs. allocated.

Rather than this subtlety, I've hoisted the actual 'return true' bits
into the code which actually found an alloca or global and based them on
the loaded pointer being that alloca or global. This is both more clear
and safer. I've also added comments about exactly why this set of
predicates is used.

I've also corrected a misleading comment about globals -- if overridden
they may not just have a different size, they may be null and completely
unsafe to load from!

Hopefully this confuses the next reader a bit less. I don't have any
test cases or anything, the patch is motivated strictly to improve the
readability of the code.

llvm-svn: 220156
2014-10-19 00:42:16 +00:00
Bob Wilson 1e1f13862e Use triple predicate functions instead of checking values directly. NFC.
llvm-svn: 220155
2014-10-19 00:39:30 +00:00
Chandler Carruth 38e98d5782 Rename 'TD' to 'DL' in this function as the argument is now a DataLayout
argument.

llvm-svn: 220151
2014-10-18 23:47:22 +00:00
Chandler Carruth 1f27f03849 Fix the other comment to use modern doxygen style and be a bit more
direct. Notably, comment on the fact that the loaded type is significant
in that it determines how wide of an access must be safe.

llvm-svn: 220150
2014-10-18 23:46:17 +00:00
Chandler Carruth be49df3d2c More formatting cleanup brought to you by clang-format.
llvm-svn: 220149
2014-10-18 23:41:25 +00:00
Chandler Carruth b56052f44d Clean up doxygen syntax and reword comments to flow better, have a brief
section, and not have unfinished sentence fragments.

llvm-svn: 220147
2014-10-18 23:31:55 +00:00
Chandler Carruth d67244df4e Clean up the formatting and trailing whitespace of a routine before
editting it.

llvm-svn: 220146
2014-10-18 23:19:03 +00:00
Lang Hames ad0962aec5 [PBQP] Replace the interference-constraints algorithm with a faster version
loosely based on linear scan.

On x86-64 this is good for a ~2% drop in compile time on the nightly test suite.

llvm-svn: 220143
2014-10-18 17:26:07 +00:00
Chandler Carruth be9dccd64d Preserve AA metadata when combining (cast (load (...))) -> (load (cast
(...))).

llvm-svn: 220141
2014-10-18 11:00:12 +00:00
Chandler Carruth 2f75fcfef3 [InstCombine] Do an about-face on how LLVM canonicalizes (cast (load
...)) and (load (cast ...)): canonicalize toward the former.

Historically, we've tried to load using the type of the *pointer*, and
tried to match that type as closely as possible removing as many pointer
casts as we could and trading them for bitcasts of the loaded value.
This is deeply and fundamentally wrong.

Repeat after me: memory does not have a type! This was a hard lesson for
me to learn working on SROA.

There is only one thing that should actually drive the type used for
a pointer, and that is the type which we need to use to load from that
pointer. Matching up pointer types to the loaded value types is very
useful because it minimizes the physical size of the IR required for
no-op casts. Similarly, the only thing that should drive the type used
for a loaded value is *how that value is used*! Again, this minimizes
casts. And in fact, the *only* thing motivating types in any part of
LLVM's IR are the types used by the operations in the IR. We should
match them as closely as possible.

I've ended up removing some tests here as they were testing bugs or
behavior that is no longer present. Mostly though, this is just cleanup
to let the tests continue to function as intended.

The only fallout I've found so far from this change was SROA and I have
fixed it to not be impeded by the different type of load. If you find
more places where this change causes optimizations not to fire, those
too are likely bugs where we are assuming that the type of pointers is
"significant" for optimization purposes.

llvm-svn: 220138
2014-10-18 06:36:22 +00:00
Nick Kledzik 3b2aa057e6 [llvm-objdump] Fix mach-o binding decompression error
llvm-svn: 220119
2014-10-18 01:21:02 +00:00
Chandler Carruth 2dc9682e59 [SROA] Change how SROA does vector-based promotion of allocas to handle
cases where the alloca type, the load types, and the store types used
all disagree.

Previously, the only way that vector-based promotion occured was if the
alloca type was a vector type. This was one of the *very* few remaining
uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns
out it was a bad idea.

The alloca type can change very easily based on the mixture of types
loaded and stored to that alloca. We shouldn't be relying on it as
a signal for very much. Instead, the source of truth should be loads and
stores. We should canonicalize the loads and stores as much as possible
and then rely on them exclusively in SROA.

When looking and loads and stores, we may find many different candidate
vector types. This change will let SROA try all of them to find a vector
type which is a viable way to promote the entire alloca to a vector
register.

With this change, it becomes possible to do better canonicalization and
optimization of loads and stores without breaking SROA in random ways,
and that should allow fixing a core source of performance loss in hot
numerical loops such as those in Eigen.

llvm-svn: 220116
2014-10-18 00:44:02 +00:00
Aaron Watry 8114437a8f R600/SI: Add global atomicrmw xchg
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220110
2014-10-17 23:33:03 +00:00
Aaron Watry d672ee2a47 R600/SI: Add global atomicrmw xor
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220109
2014-10-17 23:33:01 +00:00
Aaron Watry 8a911e6926 R600/SI: Add global atomicrmw or
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220108
2014-10-17 23:32:59 +00:00
Aaron Watry 58c9992f15 R600/SI: Add global atomicrmw min/umin
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220107
2014-10-17 23:32:57 +00:00
Aaron Watry 29f295d7a5 R600/SI: Add global atomicrmw max/umax
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220106
2014-10-17 23:32:56 +00:00
Aaron Watry 621278034c R600/SI: Add global atomicrmw and
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220105
2014-10-17 23:32:54 +00:00
Aaron Watry 328f1bae8e R600/SI: Add global atomicrmw sub
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 220104
2014-10-17 23:32:52 +00:00
Evgeniy Stepanov e08633e900 [msan] Fix handling of byval arguments with large alignment.
MSan param-tls slots are 8-byte aligned. This change clips
alignment of memcpy into param-tls to 8.

llvm-svn: 220101
2014-10-17 23:29:44 +00:00
Pete Cooper 230332f4fe Check for dynamic alloca's when selecting lifetime intrinsics.
TL;DR: Indexing maps with [] creates missing entries.

The long version:

When selecting lifetime intrinsics, we index the *static* alloca map with the AllocaInst we find for that lifetime.  Trouble is, we don't first check to see if this is a dynamic alloca.

On the attached example, this causes a dynamic alloca to create an entry in the static map, and returns 0 (the default) as the frame index for that lifetime.  0 was used for the frame index of the stack protector, which given that it now has a lifetime, is coloured, and merged with other stack slots.

PEI would later trigger an assert because it expects the stack protector to not be dead.

This fix ensures that we only get frame indices for static allocas, ie, those in the map.  Dynamic ones are effectively dropped, which is suboptimal, but at least isn't completely broken.

rdar://problem/18672951

llvm-svn: 220099
2014-10-17 22:59:33 +00:00
Rafael Espindola 7da1ea83a9 Revert "TRE: make TRE a bit more aggressive"
This reverts commit r219899.

This also updates byval-tail-call.ll to make it clear what was breaking.
Adding r219899 again will cause the load/store to disappear.

llvm-svn: 220093
2014-10-17 21:25:48 +00:00
Bill Schmidt ba637db298 [PowerPC] Change assert to better form
llvm-svn: 220092
2014-10-17 21:19:59 +00:00
Matt Arsenault a708358e93 R600/SI: Remove redundant setting of instruction bits
These are all set on the instruction base classes.

llvm-svn: 220091
2014-10-17 21:13:11 +00:00
Bill Schmidt a087d74250 [PowerPC] Change liveness testing in VSX FMA mutation pass
With VSX enabled, LLVM crashes when compiling
test/CodeGen/PowerPC/fma.ll.  I traced this to the liveness test
that's revised in this patch. The interval test is designed to only
work for virtual registers, but in this case the AddendSrcReg is
physical. Since there is already a walk of the MIs between the
AddendMI and the FMA, I added a check for def/kill of the AddendSrcReg
in that loop.  At Hal Finkel's request, I converted the liveness test
to an assert restricted to virtual registers.

I've changed the fma.ll test to have VSX and non-VSX variants so we
can test both kinds of multiply-adds.

llvm-svn: 220090
2014-10-17 21:02:44 +00:00
Matt Arsenault 933c38df40 Fix typo
llvm-svn: 220068
2014-10-17 18:02:31 +00:00
Matt Arsenault e184482bf8 R600/SI: Also check for FPImm literal constants
llvm-svn: 220067
2014-10-17 18:00:50 +00:00
Matt Arsenault d282ada508 R600/SI: Allow commuting with source modifiers
llvm-svn: 220066
2014-10-17 18:00:48 +00:00
Matt Arsenault 8943d24949 R600/SI: Simplify code with hasModifiersSet
llvm-svn: 220065
2014-10-17 18:00:45 +00:00
Matt Arsenault ace5b76739 R600/SI: Fix general commuting breaking src mods
The generic code trying to use findCommutedOpIndices won't
understand that it needs to swap the modifier operands also,
so it should fail if they are set.

llvm-svn: 220064
2014-10-17 18:00:43 +00:00
Matt Arsenault ffc5d5bbf0 R600/SI: Cleanup code with ChangeToFPImmediate
llvm-svn: 220063
2014-10-17 18:00:41 +00:00
Matt Arsenault 6d3cd544bb R600/SI: Allow comuting fp immediates
llvm-svn: 220062
2014-10-17 18:00:39 +00:00
Matt Arsenault aa5ccfb566 R600/SI: Use early return instead of checking condition twice
Any commutable instruction will have at least src1.

llvm-svn: 220061
2014-10-17 18:00:37 +00:00
Matt Arsenault 328b1193b5 R600/SI: Use complex pattern for MUBUF load patterns.
This eliminates a use of the SI_ADDR64_RSRC pseudo

llvm-svn: 220057
2014-10-17 17:43:00 +00:00
Matt Arsenault 83a535ff6b R600/SI: Remove SI_BUFFER_RSRC pseudo
Just use REG_SEQUENCE directly, so there are fewer
instructions to need to deal with later.

llvm-svn: 220056
2014-10-17 17:42:56 +00:00
Juergen Ributzka ad2363f9ee [Stackmaps] Enable invoking the patchpoint intrinsic.
Patch by Kevin Modzelewski
Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits, reames

Differential Revision: http://reviews.llvm.org/D5634

llvm-svn: 220055
2014-10-17 17:39:00 +00:00
Andrea Di Biagio c48cb86f05 [X86] Fix missed selection of non-temporal store of zero vector.
When the input to a store instruction was a zero vector, the backend
always selected a normal vector store regardless of the non-temporal
hint. This is fixed by this patch.

This fixes PR19370.

llvm-svn: 220054
2014-10-17 17:27:06 +00:00
James Molloy f497d5511d [AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering.
We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same.

While we're at it, avoid writing to std::vector::end()...

Bug found with random testing and a lot of coffee.

llvm-svn: 220051
2014-10-17 17:06:31 +00:00
Bill Schmidt 2d1128acb2 [PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation
Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64
types, but does not yet use lxvw4x and stxvw4x for 4x32 types.  This
patch adds that support.

As with lxvd2x/stxvd2x, this involves straightforward overriding of
the patterns normally recognized for lvx/stvx, with preference given
to the VSX patterns when VSX is enabled.

In addition, the logic for permitting misaligned memory accesses is
modified so that v4r32 and v4i32 are treated the same as v2f64 and
v2i64 when VSX is enabled.  Finally, the DAG generation for unaligned
loads is changed to just use a normal LOAD (which will become lxvw4x)
on P8 and later hardware, where unaligned loads are preferred over
lvsl/lvx/lvx/vperm.

A number of tests now generate the VSX loads/stores instead of
lvx/stvx, so this patch adds VSX variants to those tests.  I've also
added <4 x float> tests to the vsx.ll test case, and created a
vsx-p8.ll test case to be used for testing code generation for the
P8Vector feature.  For now, that simply tests the unaligned load/store
behavior.

This has been tested along with a temporary patch to enable the VSX
and P8Vector features, with no new regressions encountered with or
without the temporary patch applied.

llvm-svn: 220047
2014-10-17 15:13:38 +00:00
Jan Vesely 54468a5a58 Mips: Only set divrem i64 to custom on 64bit
Reviewed-by: Daniel Sanders <daniel.sanders@imgtec.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 220046
2014-10-17 14:45:28 +00:00
Jan Vesely af62cf4db0 SelectionDAG: Add sext_inreg optimizations
v2: use dyn_cast
    fixup comments
v3: use cast

Reviewed-by: Matt Arsenault <arsenm2@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 220044
2014-10-17 14:45:25 +00:00
Vasileios Kalintiris 238692beb9 [mips] Add support for COP1's Branch-On-Cond-Likely instructions
Summary: Depends on D5782

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5802

llvm-svn: 220042
2014-10-17 14:08:28 +00:00
Vasileios Kalintiris 6d1e64896d [mips] Add support for COP0's Branch-On-Cond-Likely instructions
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5782

llvm-svn: 220036
2014-10-17 12:38:35 +00:00
Hal Finkel dd38c0b876 [DSE] Remove no-data-layout-only type-based overlap checking
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.

Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.

llvm-svn: 220035
2014-10-17 11:56:00 +00:00
Rafael Espindola b66130209b Add back commits r219835 and a fixed version of r219829.
The only difference from r219829 is using

getOrCreateSectionSymbol(*ELFSec)

instead of

GetOrCreateSymbol(ELFSec->getSectionName())

in ELFObjectWriter which causes us to use the correct section symbol even if
we have multiple sections with the same name.

Original messages:

r219829:
Correctly handle references to section symbols.

When processing assembly like

.long .text

we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.

This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.

The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.

r219835:
Allow forward references to section symbols.

llvm-svn: 220021
2014-10-17 01:48:58 +00:00
Akira Hatanaka 0d0c78180d ARM: Fix a bug which was causing convergence failure in constant-island pass.
The bug is in ARMConstantIslands::createNewWater where the upper bound of the
new water split point is computed:

// This could point off the end of the block if we've already got constant
// pool entries following this block; only the last one is in the water list.
// Back past any possible branches (allow for a conditional and a maximally
// long unconditional).
if (BaseInsertOffset + 8 >= UserBBI.postOffset()) {
  BaseInsertOffset = UserBBI.postOffset() - UPad - 8;
  DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset));
}

The split point is supposed to be somewhere between the machine instruction that
loads from the constant pool entry and the end of the basic block, before branch
instructions. The code above is fine if the basic block is large enough and
there are a sufficient number of instructions following the machine instruction.
However, if the machine instruction is near the end of the basic block,
BaseInsertOffset can point to the machine instruction or another instruction
that precedes it, and this can lead to convergence failure.

This commit fixes this bug by ensuring BaseInsertOffset is larger than the
offset of the instruction following the constant-loading instruction.

rdar://problem/18581150

llvm-svn: 220015
2014-10-17 01:31:47 +00:00
Rafael Espindola 4544a4062c Revert commit r219835 and r219829.
Revert "Correctly handle references to section symbols."
Revert "Allow forward references to section symbols."

Rui found a regression I am debugging.

llvm-svn: 220010
2014-10-17 01:06:02 +00:00
Peter Zotov aff492c6fd [LLVM-C] Add LLVMInstructionClone.
llvm-svn: 220007
2014-10-17 01:02:34 +00:00
Matt Arsenault bfaab76f6b R600/SI: Simplify debug printing
llvm-svn: 219999
2014-10-17 00:36:20 +00:00
Matt Arsenault 661a031af6 R600/SI: Remove another VALU pattern
llvm-svn: 219988
2014-10-16 23:33:37 +00:00
Peter Collingbourne e186319319 Introduce LLVMParseCommandLineOptions C API function.
llvm-svn: 219975
2014-10-16 22:47:52 +00:00
Juergen Ributzka fd4633e1a5 Reduce code duplication between patchpoint and non-patchpoint lowering. NFC.
This is in preparation for another patch that makes patchpoints invokable.

Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5657

llvm-svn: 219967
2014-10-16 21:26:35 +00:00
Chandler Carruth 8393406f05 [SROA] Switch the common variable name for the 'AllocaSlices' class to
'AS'.

Using 'S' as this was a terrible idea. Arguably, 'AS' is not much
better, but it at least follows the idea of using initialisms and
removes active confusion about the AllocaSlices variable and a Slice
variable.

llvm-svn: 219963
2014-10-16 21:11:55 +00:00
Chandler Carruth 61747042c1 [SROA] More range-based cleanups to SROA, these brought to you by
clang-modernize.

I did have to clean up the variable types and whitespace a bit because
the use of auto made the code much less readable here.

llvm-svn: 219962
2014-10-16 21:05:14 +00:00
Chandler Carruth 57d4cae202 [SROA] Switch a couple of overly complex iterator accessors to just be
ArrayRef accessors.

I think this even came up in review that this was over-engineered, and
indeed it was. Time to un-build it.

llvm-svn: 219958
2014-10-16 20:42:08 +00:00
Robin Morisset e2de06bef6 Erase fence insertion from SelectionDAGBuilder.cpp (NFC)
Summary:
Backends can use setInsertFencesForAtomic to signal to the middle-end that
montonic is the only memory ordering they can accept for
stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger
ordering to fences + monotonic accesses is currently living in
SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it
for several reasons:
- There is lots of redundancy to avoid: extremely similar logic already
  exists in AtomicExpand.
- The current code in SelectionDAGBuilder does not use any target-hooks, it
  does the same transformation for every backend that requires it
- As a result it is plain *unsound*, as it was apparently designed for ARM.
  It happens to mostly work for the other targets because they are extremely
  conservative, but Power for example had to switch to AtomicExpand to be
  able to use lwsync safely (see r218331).
- Because it produces IR-level fences, it cannot be made sound ! This is noted
  in the C++11 standard (section 29.3, page 1140):
```
Fences cannot, in general, be used to restore sequential consistency for atomic
operations with weaker ordering semantics.
```
It can also be seen by the following example (called IRIW in the litterature):
```
atomic<int> x = y = 0;
int r1, r2, r3, r4;
Thread 0:
  x.store(1);
Thread 1:
  y.store(1);
Thread 2:
  r1 = x.load();
  r2 = y.load();
Thread 3:
  r3 = y.load();
  r4 = x.load();
```
r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst.
But if they are lowered to monotonic accesses, no amount of fences can prevent it..

This patch does three things (I could cut it into parts, but then some of them
would not be tested/testable, please tell me if you would prefer that):
- it provides a default implementation for emitLeadingFence/emitTrailingFence in
terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder.
As we saw above, this is unsound, but the best that can be done without knowing
the targets well (and there is a comment warning about this risk).
- it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default
implementation (that exactly replicates the logic of SelectionDAGBuilder, so no
functional change)
- it finally erase this logic from SelectionDAGBuilder as it is dead-code.

Ideally, each target would define its own override for emitLeading/TrailingFence
using target-specific fences, but I do not know the Sparc/Mips/XCore memory model
well enough to do this, and they appear to be dealing fine with the ARM-inspired
default expansion for now (probably because they are overly conservative, as
Power was). If anyone wants to compile fences more agressively on these
platforms, the long comment should make it clear why he should first override
emitLeading/TrailingFence.

Test Plan: make check-all, no functional change

Reviewers: jfb, t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5474

llvm-svn: 219957
2014-10-16 20:34:57 +00:00
Matt Arsenault 70c82173f3 R600/SI: Remove unnecessary VALU patterns
These haven't been necessary since allowing
selecting SALU instructions in non-entry blocks
was enabled.

llvm-svn: 219956
2014-10-16 20:31:50 +00:00
Chandler Carruth c659df9389 [SROA] Start more deeply moving SROA to use ranges rather than just
iterators.

There are a ton of places where it essentially wants ranges
rather than just iterators. This is just the first step that adds the
core slice range typedefs and uses them in a couple of places. I still
have to explicitly construct them because they've not been punched
throughout the entire set of code. More range-based cleanups incoming.

llvm-svn: 219955
2014-10-16 20:24:07 +00:00
Matt Arsenault a3fe7c62d1 R600: Fix nonsensical implementation of computeKnownBits for BFE
This was resulting in invalid simplifications of sdiv

llvm-svn: 219953
2014-10-16 20:07:40 +00:00
Rafael Espindola 11aaaeebe0 Delete -std-compile-opts.
These days -std-compile-opts was just a silly alias for -O3.

llvm-svn: 219951
2014-10-16 20:00:02 +00:00
Bjorn Steinbrink d20816fde9 Allow call-slop optzn for destinations with a suitable dereferenceable attribute
Summary:
Currently, call slot optimization requires that if the destination is an
argument, the argument has the sret attribute. This is to ensure that
the memory access won't trap. In addition to sret, we can also allow the
optimization to happen for arguments that have the new dereferenceable
attribute, which gives the same guarantee.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5832

llvm-svn: 219950
2014-10-16 19:43:08 +00:00
Sanjay Patel c699a6117b fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().

In the simplest case, this:

   y = sqrt(x * x);

becomes this:

   y = fabs(x);

This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.

Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.

Differential Revision: http://reviews.llvm.org/D5787

llvm-svn: 219944
2014-10-16 18:48:17 +00:00
Juergen Ributzka 03a0611061 [AArch64] Fix miscompile of sdiv-by-power-of-2.
When the constant divisor was larger than 32bits, then the optimized code
generated for the AArch64 backend would emit the wrong code, because the shift
was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would
loose the upper 32bits.

This fixes rdar://problem/18678801.

llvm-svn: 219934
2014-10-16 16:41:15 +00:00
Vasileios Kalintiris 167c372118 [mips] Account for endianess when expanding BuildPairF64/ExtractElementF64 nodes.
Summary:
In order to support big endian targets for the BuildPairF64 nodes we
just need to swap the low/high pair registers. Additionally, for the
ExtractElementF64 nodes we have to calculate the correct stack offset
with respect to the node's register/operand that we want to extract.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5753

llvm-svn: 219931
2014-10-16 15:41:51 +00:00
Vasileios Kalintiris 711028f718 [mips] Marked the DI/EI instruction aliases as MIPS32r2
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5751

llvm-svn: 219927
2014-10-16 15:23:52 +00:00
Vasileios Kalintiris f445a56b61 Test commit access: remove extra new line at the end of file
llvm-svn: 219925
2014-10-16 14:37:00 +00:00
Akira Hatanaka 5c221ef98f Reapply r219832 - InstCombine: Narrow switch instructions using known bits.
The code committed in r219832 asserted when it attempted to shrink a switch
statement whose type was larger than 64-bit.

llvm-svn: 219902
2014-10-16 06:00:46 +00:00
Saleem Abdulrasool 7f52921976 TRE: make TRE a bit more aggressive
Make tail recursion elimination a bit more aggressive.  This allows us to get
tail recursion on functions that are just branches to a different function.  The
fact that the function takes a byval argument does not restrict it from being
optimised into just a tail call.

llvm-svn: 219899
2014-10-16 03:27:30 +00:00
Akira Hatanaka 40c2cf4afc Revert r219832.
llvm-svn: 219884
2014-10-16 01:17:02 +00:00
Hal Finkel 2400c96cc3 [LVI] Add some additional comments about caching and context instructions
Philip Reames and I had a long conversation about this, mostly because it is
not obvious why the current logic is correct. Hopefully, these comments will
prevent such confusion in the future.

llvm-svn: 219882
2014-10-16 00:40:05 +00:00
Matt Arsenault f1b34cf6b6 R600: Remove dead function
llvm-svn: 219879
2014-10-16 00:08:09 +00:00
Sanjoy Das 360b1ed5f2 Revert "r219834 - Teach ScalarEvolution to sharpen range information"
This change breaks the asan buildbots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468

llvm-svn: 219878
2014-10-15 23:46:04 +00:00
Hal Finkel 68dc3c7ab2 Preserve non-byval pointer alignment attributes using @llvm.assume when inlining
For pointer-typed function arguments, enhanced alignment can be asserted using
the 'align' attribute. When inlining, if this enhanced alignment information is
not otherwise available, preserve it using @llvm.assume-based alignment
assumptions.

llvm-svn: 219876
2014-10-15 23:44:41 +00:00
Hal Finkel 6f814db8d7 Add CreateAlignmentAssumption to IRBuilder
Clang CodeGen had a utility function for creating pointer alignment assumptions
using the @llvm.assume intrinsic. This functionality will also be needed by the
inliner (to preserve function-argument alignment attributes when inlining), so
this moves the utility function into IRBuilder where it can be used both by
Clang CodeGen and also other LLVM-level code.

llvm-svn: 219875
2014-10-15 23:44:22 +00:00
Adam Nemet 4285c1f8cc [AVX512] Add DQ subvector inserts
In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4
respectively.  These are matched by "Alt" Pat<>'s (Alt stands for alternative
VTs).

Since DQ has native support for these intructions, I peeled off the non-"Alt"
part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are
derived from this multiclass.  The "Alt" Pat<>'s are disabled with DQ.

Fixes <rdar://problem/18426089>

llvm-svn: 219874
2014-10-15 23:42:17 +00:00
Adam Nemet 449b3f0931 [AVX512] Two new attributes in X86VectorVTInfo for subvector insert
The new attributes are NumElts and the CD8TupleForm.  This prepares the code
to enable x8 and x2 inserts.

NFC, no change in X86.td.expanded except for the new attributes.

llvm-svn: 219871
2014-10-15 23:42:09 +00:00
Adam Nemet b1c3ef4b60 [AVX512] Rename arg from Opcode32/64 to Opcode128/256 in vinsert_for_size
It's the W bit that selects between 32 or 64 elt type and not the opcode.  The
opcode selects between the width of the insert (128 or 256).

llvm-svn: 219870
2014-10-15 23:42:04 +00:00
Matt Arsenault 20893b3611 R600: Remove unnecessary part of computeKnownBitsForTargetNode
Zero-width BFEs are combined away already, so there's no point in
handling them.

llvm-svn: 219868
2014-10-15 23:37:49 +00:00
Matt Arsenault 6de7af4242 Move variable down to use
llvm-svn: 219867
2014-10-15 23:37:42 +00:00
Alexander Potapenko 6909b5b567 Add MachOObjectFile::getUuid()
This CL introduces MachOObjectFile::getUuid(). This function returns an ArrayRef to the object file's UUID, or an empty ArrayRef if the object file doesn't contain an LC_UUID load command.
The new function is gonna be used by llvm-symbolizer.

llvm-svn: 219866
2014-10-15 23:35:45 +00:00
Chris Bieneman 5c4e9551c9 Fixing the build failure due to compiler warnings and unnecessary disambiguation.
llvm-svn: 219861
2014-10-15 23:11:35 +00:00
Chris Bieneman 732e0aa9fb Defining a new API for debug options that doesn't rely on static global cl::opts.
Summary:
This is based on the discussions from the LLVMDev thread:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075886.html

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5389

llvm-svn: 219854
2014-10-15 21:54:35 +00:00
Tom Stellard c8d7920ad9 R600/SI: Fix bug where immediates were being used in DS addr operands
The SelectDS1Addr1Offset complex pattern always tries to store constant
lds pointers in the offset operand and store a zero value in the addr operand.
Since the addr operand does not accept immediates, the zero value
needs to first be copied to a register.

This newly created zero value will not go through normal instruction
selection, so we need to manually insert a V_MOV_B32_e32 in the complex
pattern.

This bug was hidden by the fact that if there was another zero value
in the DAG that had not been selected yet, then the CSE done by the DAG
would use the unselected node for the addr operand rather than the one
that was just created.  This would lead to the zero value being selected
and the DAG automatically inserting a V_MOV_B32_e32 instruction.

llvm-svn: 219848
2014-10-15 21:08:59 +00:00
Eric Christopher 2181fb2ff3 Avoid caching the MachineFunction, we don't use it outside of
runOnMachineFunction.

llvm-svn: 219847
2014-10-15 21:06:25 +00:00
Sid Manning a002296427 Wrong attribute. LLVM_ATTRIBUTE_UNUSED not LLVM_ATTRIBUTE_USED
This original fix for the build break was correct.  LLVM_ATTRIBUTE_USED
removes the warning message because it keeps the function in the object
file.  LLVM_ATTRIBUTE_UNUSED indicates that it may or may not be used
depending on build settings.

llvm-svn: 219846
2014-10-15 20:41:17 +00:00
Duncan P. N. Exon Smith 8d5aeb2698 IR: Move NumOperands from User to Value, NFC
Store `User::NumOperands` (and `MDNode::NumOperands`) in `Value`.

On 64-bit host architectures, this reduces `sizeof(User)` and all
subclasses by 8, and has no effect on `sizeof(Value)` (or, incidentally,
on `sizeof(MDNode)`).

On 32-bit host architectures, this increases `sizeof(Value)` by 4.
However, it has no effect on `sizeof(User)` and `sizeof(MDNode)`, so the
only concrete subclasses of `Value` that actually see the increase are
`BasicBlock`, `Argument`, `InlineAsm`, and `MDString`.  Moreover, I'll
be shocked and confused if this causes a tangible memory regression.

This has no functionality change (other than memory footprint).

llvm-svn: 219845
2014-10-15 20:39:05 +00:00
Duncan P. N. Exon Smith fcece4d216 IR: Cleanup comments for Value, User, and MDNode
A follow-up commit will modify the memory-layout of `Value`, `User`, and
`MDNode`.  First fix the comments to be doxygen-friendly (and to follow
the coding standards).

  - Use "\brief" instead of "repeatedName -".
  - Add a brief intro where it was missing.
  - Remove duplicated comments from source files (and a couple of
    noisy/trivial comments altogether).

llvm-svn: 219844
2014-10-15 20:28:31 +00:00
Sid Manning 2ceaeb6baf Wrong attribute. LLVM_ATTRIBUTE_USED not LLVM_ATTRIBUTE_UNUSED
llvm-svn: 219837
2014-10-15 19:32:52 +00:00
Rafael Espindola 78206c3576 Allow forward references to section symbols.
llvm-svn: 219835
2014-10-15 19:30:18 +00:00
Sanjoy Das 90c2f1455a Teach ScalarEvolution to sharpen range information.
If x is known to have the range [a, b) in a loop predicated by (icmp
ne x, a), its range can be sharpened to [a + 1, b).  Get
ScalarEvolution and hence IndVars to exploit this fact.
    
This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.

phabricator: http://reviews.llvm.org/D5639
llvm-svn: 219834
2014-10-15 19:25:28 +00:00
Sid Manning 74cd020fca Add LLVM_ATTRIBUTE_UNUSED to function currently just used in an assert
Fixes break when -Wunused-function is used.

llvm-svn: 219833
2014-10-15 19:24:14 +00:00
Akira Hatanaka 5bb9346a45 InstCombine: Narrow switch instructions using known bits.
Truncate the operands of a switch instruction to a narrower type if the upper
bits are known to be all ones or zeros.

rdar://problem/17720004

llvm-svn: 219832
2014-10-15 19:05:50 +00:00
Juergen Ributzka f82c987a5c Reapply "[FastISel][AArch64] Add custom lowering for GEPs."
This is mostly a copy of the existing FastISel GEP code, but we have to
duplicate it for AArch64, because otherwise we would bail out even for simple
cases. This is because the standard fastEmit functions don't cover MUL at all
and ADD is lowered very inefficientily.

The original commit had a bug in the add emit logic, which has been fixed.

llvm-svn: 219831
2014-10-15 18:58:07 +00:00
Juergen Ributzka 6780f0f7a0 [FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC.
Simplify add with immediate emission by factoring it out into a helper function.

llvm-svn: 219830
2014-10-15 18:58:02 +00:00
Rafael Espindola a74b5e6823 Correctly handle references to section symbols.
When processing assembly like

.long .text

we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.

This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.

The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.

llvm-svn: 219829
2014-10-15 18:55:30 +00:00
Sid Manning 12cd21aacd Enable the instruction printer in HexagonMCTargetDesc
This adds the MCInstPrinter to the LLVMHexagonDesc library and removes
the dependency LLVMHexagonAsmPrinter had on LLVMHexagonDesc. This is
a prerequisite needed by the disassembler.

Phabricator Revision: http://reviews.llvm.org/D5734

llvm-svn: 219826
2014-10-15 18:27:40 +00:00
Matt Arsenault 1a74aff846 R600/SI: Also try to use 0 base for misaligned 8-byte DS loads.
llvm-svn: 219823
2014-10-15 18:06:43 +00:00
Matt Arsenault 7b68fdf3c0 R600: Fix miscompiles when BFE has multiple uses
SimplifyDemandedBits would break the other uses of the operand.

llvm-svn: 219819
2014-10-15 17:58:34 +00:00
Sanjay Patel c00017d1f6 correct const-ness with auto and dyn_cast
1. Use const with autos.
2. Don't bother with explicit const in cast ops because they do it automagically.

Thanks, David B. / Aaron B. / Reid K.

llvm-svn: 219817
2014-10-15 17:45:13 +00:00
Hal Finkel 3b7fc86677 [SLPVectorize] Basic ephemeral-value awareness
The SLP vectorizer should not vectorize ephemeral values. These are used to
express information to the optimizer, and vectorizing them does not lead to
faster code (because the ephemeral values are dropped prior to code generation,
vectorized or not), and obscures the information the instructions are
attempting to communicate (the logic that interprets the arguments to
@llvm.assume generically does not understand vectorized conditions).

Also, uses by ephemeral values are free (because they, and the necessary
extractelement instructions, will be dropped prior to code generation).

llvm-svn: 219816
2014-10-15 17:35:01 +00:00
Hal Finkel 8683d2b0d2 Treat the WorkSet used to find ephemeral values as double-ended
We need to make sure that we visit all operands of an instruction before moving
deeper in the operand graph. We had been pushing operands onto the back of the work
set, and popping them off the back as well, meaning that we might visit an
instruction before visiting all of its uses that sit in between it and the call
to @llvm.assume.

To provide an explicit example, given the following:
  %q0 = extractelement <4 x float> %rd, i32 0
  %q1 = extractelement <4 x float> %rd, i32 1
  %q2 = extractelement <4 x float> %rd, i32 2
  %q3 = extractelement <4 x float> %rd, i32 3
  %q4 = fadd float %q0, %q1
  %q5 = fadd float %q2, %q3
  %q6 = fadd float %q4, %q5
  %qi = fcmp olt float %q6, %q5
  call void @llvm.assume(i1 %qi)

%q5 is used by both %qi and %q6. When we visit %qi, it will be marked as
ephemeral, and we'll queue %q6 and %q5. %q6 will be marked as ephemeral and
we'll queue %q4 and %q5. Under the old system, we'd then visit %q4, which
would become ephemeral, %q1 and then %q0, which would become ephemeral as
well, and now we have a problem. We'd visit %rd, but it would not be marked as
ephemeral because we've not yet visited %q2 and %q3 (because we've not yet
visited %q5).

This will be covered by a test case in a follow-up commit that enables
ephemeral-value awareness in the SLP vectorizer.

llvm-svn: 219815
2014-10-15 17:34:48 +00:00
Derek Schuff 05fb735f3a [MC] Make bundle alignment mode setting idempotent and support nested bundles
Summary:
Currently an error is thrown if bundle alignment mode is set more than once
per module (either via the API or the .bundle_align_mode directive). This
change allows setting it multiple times as long as the alignment doesn't
change.

Also nested bundle_lock groups are currently not allowed. This change allows
them, with the effect that the group stays open until all nests are exited,
and if any of the bundle_lock directives has the align_to_end flag, the
group becomes align_to_end.

These changes make the bundle aligment simpler to use in the compiler, and
also better match the corresponding support in GNU as.

Reviewers: jvoung, eliben

Differential Revision: http://reviews.llvm.org/D5801

llvm-svn: 219811
2014-10-15 17:10:04 +00:00
Duncan P. N. Exon Smith 7f637a9b48 DI: Make comments "brief"-er, NFC
Follow-up to r219801.  Post-commit review pointed out that all comments
require a `\brief` description [1], so I converted many and recrafted a
few to be briefer or to include a brief intro.  (If I'm going to clean
them up, I should do it right!)

[1]: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

llvm-svn: 219808
2014-10-15 17:01:28 +00:00
Sanjay Patel 473e7fdb08 Use 'auto' for easier reading; no functional change intended.
llvm-svn: 219804
2014-10-15 16:21:37 +00:00
Duncan P. N. Exon Smith d79c4fd595 DI: Cleanup comments, NFC
A number of comment cleanups:

  - Remove duplicated function and class names from comments.

  - Remove duplicated comments from source file (some of which were
    out-of-sync).

  - Move any unduplicated comments from source file to header.

  - Remove some noisy comments entirely (e.g., a comment for
    `DIDescriptor::print()` saying "print descriptor" just gets in the
    way of reading the code).

llvm-svn: 219801
2014-10-15 16:15:15 +00:00
Rafael Espindola 7b61ddfa6e Simplify handling of --noexecstack by using getNonexecutableStackSection.
llvm-svn: 219799
2014-10-15 16:12:52 +00:00
Duncan P. N. Exon Smith 3bfffde27a DI: Use a `DenseMap` instead of named metadata, NFC
Remove a strange round-trip through named metadata to assign preserved
local variables to their subprograms.

llvm-svn: 219798
2014-10-15 16:11:41 +00:00
Rafael Espindola ad33dd2914 Move getNonexecutableStackSection up to the base ELF class.
The .note.GNU-stack section is not SystemZ/X86 specific.

llvm-svn: 219796
2014-10-15 15:44:16 +00:00
Matt Arsenault f179420c57 R600: Use existing variable
llvm-svn: 219778
2014-10-15 05:07:00 +00:00
Matt Arsenault 7acfddf17c R600: Remove outdated comment
llvm-svn: 219777
2014-10-15 05:06:57 +00:00
Juergen Ributzka 42379d4cf7 Revert "[FastISel][AArch64] Add custom lowering for GEPs."
This breaks our internal build bots. Reverting it to get the bots green again.

llvm-svn: 219776
2014-10-15 04:55:48 +00:00
Jingyue Wu 2954280f6a [MachineSink] Use the real post dominator tree
Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in
isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo.
The old heuristics caused instructions to sink unnecessarily, and might create
register pressure.

This is the second try of the fix. The first one (D4814) caused a performance
regression due to failing to sink instructions out of loops (PR21115). This
patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower
one regardless of whether the target block post-dominates the source.

Thanks Alexey Volkov for reporting PR21115! 

Test Plan:
Added a NVPTX codegen test to verify that our change prevents the backend from
over-sinking. It also shows the unnecessary register pressure caused by
over-sinking.

Added an X86 test to verify we can sink instructions out of loops regardless of
the dominance relationship. This test is reduced from Alexey's test in PR21115.

Updated an affected test in X86.

Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime
performance. Results are attached separately in the review thread.

Reviewers: Jiangning, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5633

llvm-svn: 219773
2014-10-15 03:27:43 +00:00
Tim Northover e9ff4c29b9 ARM: drop check for triple that's no longer used.
Early attempts to support AAPCS bare metal MachO targets based the decision on
the CPU being compiled for. This was not a particularly great idea and we've
got a better option now, but this check remained.

No functional change for any target we care about.

llvm-svn: 219767
2014-10-15 01:05:01 +00:00