Commit Graph

32982 Commits

Author SHA1 Message Date
Dan Gohman c17e140b39 [WebAssembly] Change int_wasm_memory_size from IntrNoMem to IntrReadMem.
llvm-svn: 253147
2015-11-14 23:02:31 +00:00
Simon Pilgrim 0de179b23b [X86][SSE] Fixed arch/triple and regenerated results.
Tidyup before diffs from new patch.

llvm-svn: 253144
2015-11-14 20:42:01 +00:00
Simon Pilgrim 96d34d34b0 [X86][SSE] Added extra vector truncation tests
Baseline comparison to D14588

llvm-svn: 253132
2015-11-14 15:23:59 +00:00
Michael Zolotukhin 8ef44f93ca Don't recompute LCSSA after loop-unrolling when possible.
Summary:
Currently we always recompute LCSSA for outer loops after unrolling an
inner loop. That leads to compile time problem when we have big loop
nests, and we can solve it by avoiding unnecessary work. For instance,
if w eonly do partial unrolling, we don't break LCSSA, so we don't need
to rebuild it. Also, if all exits from the inner loop are inside the
enclosing loop, then complete unrolling won't break LCSSA either.

I replaced unconditional LCSSA recomputation with conditional recomputation +
unconditional assert and added several tests, which were failing when I
experimented with it.

Soon I plan to follow up with a similar patch for recalculation of dominators
tree.

Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14526

llvm-svn: 253126
2015-11-14 05:51:41 +00:00
Quentin Colombet 2cdcfd23cd [ShrinkWrapping] Disable the optimization for functions with sanitize like
attribute.

Even if the target supports shrink-wrapping, the prologue and epilogue
must not move because a crash can happen anywhere and sanitizers need
to be able to unwind from the PC of the crash.

llvm-svn: 253116
2015-11-14 01:55:17 +00:00
Chad Rosier cc299b627d [LIR] Add support for creating memcpys from loops with a negative stride.
This allows us to transform the below loop into a memcpy.

void test(unsigned *__restrict__ a, unsigned *__restrict__ b) {
  for (int i = 2047; i >= 0; --i) {
    a[i] = b[i];
  }
}

This is the memcpy version of r251518, which added support for memset with
negative strided loops.

llvm-svn: 253091
2015-11-13 21:51:02 +00:00
Reid Kleckner 75b4be9a11 [WinEH] Fix ESP management with 32-bit __CxxFrameHandler3
The C++ EH personality automatically restores ESP from the C++ EH
registration node after a catchret. I mistakenly thought it was like
SEH, which does not restore ESP.

It makes sense for C++ EH to differ from SEH here because SEH does not
use funclets for catches, and does not allow catching inside of finally.
C++ EH may need to unwind through multiple catch funclets and eventually
catchret to some outer funclet. Therefore, the runtime has to keep track
of which ESP to use with catchret, rather than having the compiler
reload it manually.

llvm-svn: 253084
2015-11-13 21:27:00 +00:00
Evgeniy Stepanov 447bbdb171 [safestack] Rewrite isAllocaSafe using SCEV.
Use ScalarEvolution to calculate memory access bounds.
Handle function calls based on readnone/nocapture attributes.
Handle memory intrinsics with constant size.

This change improves both recall and precision of IsAllocaSafe.
See the new tests (ex. BitCastWide) for the kind of code that was wrongly
classified as safe.

SCEV efficiency seems to be limited by the fact the SafeStack runs late
(in CodeGenPrepare), and many loops are unrolled or otherwise not in LCSSA.

llvm-svn: 253083
2015-11-13 21:21:42 +00:00
Diego Novillo 8e415a821f SamplePGO - Add dump routines for LineLocation, SampleRecord and FunctionSamples
llvm-svn: 253071
2015-11-13 20:24:28 +00:00
Cong Hou ef4074bac2 [X86][SSE] Combine UNPCKL with vector_shuffle into UNPCKH to save one instruction for sext from v16i8 to v16i16 and v8i16 to v8i32.
This patch is enabling combining UNPCKL with vector_shuffle that moves the upper
half of a vector into the lower half, into a UNPCKH instruction. For example:

t2: v16i8 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t1, undef:v16i8
t3: v16i8 = X86ISD::UNPCKL undef:v16i8, t2

will be combined to:

t3: v16i8 = X86ISD::UNPCKH undef:v16i8, t1


Differential revision: http://reviews.llvm.org/D14399

llvm-svn: 253067
2015-11-13 19:47:43 +00:00
David Blaikie 8e8dd57e0b dwarfdump: Add support for dumping the table contents of DWP indexes
This is a recommit of 252842 which was reverted in 252859. The issue was
using %s format specifier for a StringRef - used Format's
left_justify(StringRef, int) instead.

It'd be nice to have __attribute__((format(..))) on llvm::format, but
apparently it's only implemented for c-style variadics, not C++ variadic
templates. Perhaps we could fix that & conditionalize the attribute on
such...

llvm-svn: 253065
2015-11-13 19:18:49 +00:00
Reid Kleckner 82a6d4bf5c Add missing triple to WinEH test case
llvm-svn: 253062
2015-11-13 19:11:12 +00:00
Reid Kleckner 94b57065c6 [WinEH] Make UnwindHelp a fixed stack object allocated after XMM CSRs
Now the offset of UnwindHelp in our EH tables and the offset that we
store to in the prologue agree.

llvm-svn: 253059
2015-11-13 19:06:01 +00:00
Tom Stellard f9f5f12ce7 ELFYAML: Add support for parsing AMDGPU section attribute flags
Reviewers: silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14444

llvm-svn: 253052
2015-11-13 17:06:29 +00:00
James Molloy b564098c62 [ARM] Replace ARMISD::RBIT with ISD::BITREVERSE
ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering.

llvm-svn: 253047
2015-11-13 16:05:22 +00:00
Zlatko Buljan 32fb5c40d2 [mips][microMIPS] Implement SHRA[_R].PH, SHRAV[_R].PH, SHRAV[_R].QB, SHRAV_R.W, SHRA_R.W, SHRL.PH, SHRL.QB, SHRLV.PH and SHRLV.QB instructions
Differential Revision: http://reviews.llvm.org/D14010

llvm-svn: 253041
2015-11-13 13:14:25 +00:00
Daniel Sanders dd0eb2bbdd [mips][ias] Explicitly disable IAS on asm-large-immediate.ll.
NFC at the moment but it will prevent a failure when IAS is enabled by default.

llvm-svn: 253039
2015-11-13 13:02:31 +00:00
Daniel Sanders 05d81d8286 [mips][ias] Replace invalid assembly insn in test since IAS parses inline assembly.
This is NFC at the moment but will prevent this test from failing when
IAS is the default.

llvm-svn: 253033
2015-11-13 11:44:00 +00:00
James Molloy 67ca6edbb1 [AArch64] Check the expansion of BITREVERSE in regression test
Something I missed from Hal's review, rightly pointed out by Ben Kramer - we should make sure the expansion is properly checked as it can be easy for bugs to creep in.

I've checked the scalar i8 expansion here and the vector i8 expansion in a previous commit.

llvm-svn: 253024
2015-11-13 10:05:31 +00:00
James Molloy bb1dbf530a [SDAG] Fix expansion of BITREVERSE
Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash.

What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size.

Use an APInt for the shift and VT.getScalarSizeInBits().

llvm-svn: 253023
2015-11-13 10:02:36 +00:00
NAKAMURA Takumi 7706fe58d2 llvm/test/tools/llvm-profdata/text-format-errors.test: Use prepared version of the input file, instead of using echo.
...and s/\C9/\xC9/

llvm-svn: 253014
2015-11-13 06:06:58 +00:00
Nathan Slingerland 4f82366759 [llvm-profdata] Add check for text profile formats and improve error reporting (2nd try)
Summary:
This change addresses two possible instances of user error / confusion when
merging sampled profile data.

Previously any input that didn't match the raw or processed instrumented format
would automatically be interpreted as instrumented profile text format data.
No error would be reported during the merge.

Example:
If foo-sampled.profdata and bar-sampled.profdata are binary sampled profiles:

Old behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -output foobar-sampled.profdata
$ llvm-profdata show -sample foobar-sampled.profdata
error: foobar-sampled.profdata:1: Expected 'mangled_name:NUM:NUM', found  lprofi

This change adds basic checks for valid input data when assuming text input.
It also makes error messages related to file format validity more specific about
the assumbed profile data type.

New behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -o foobar-sampled.profdata
error: foo.profdata: Unrecognized instrumentation profile encoding format
Perhaps you forgot to use the -sample option?

Reviewers: bogner, davidxl, dnovillo

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14558

llvm-svn: 253009
2015-11-13 03:47:58 +00:00
Colin LeMahieu fa5558307b [Hexagon] NFC. Adding a number of packet correctness tests.
llvm-svn: 253000
2015-11-13 01:46:06 +00:00
Dan Gohman f19ed56288 [WebAssembly] Inline asm support.
llvm-svn: 252997
2015-11-13 01:42:29 +00:00
Colin LeMahieu 8bb168b160 [Hexagon] Adding relaxation functionality to backend and test.
llvm-svn: 252989
2015-11-13 01:12:25 +00:00
Joseph Tremoulet 149c433bcc [WinEH] Find root frame correctly in CLR funclets
Summary:
The value that the CoreCLR personality passes to a funclet for the
establisher frame may be the root function's frame or may be the parent
funclet's (mostly empty) frame in the case of nested funclets.  Each
funclet stores a pointer to the root frame in its own (mostly empty)
frame, as does the root function itself.  All frames allocate this slot at
the same offset, measured from the post-prolog stack pointer, so that the
same sequence can accept any ancestor as an establisher frame parameter
value, and so that a single offset can be reported to the GC, which also
looks at this slot.

This change allocate the slot when processing function entry, and records
its frame index on the WinEHFuncInfo object, then inserts the code to
set/copy it during prolog emission.


Reviewers: majnemer, AndyAyers, pgavlin, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14614

llvm-svn: 252983
2015-11-13 00:39:23 +00:00
Dan Gohman 058fce5435 [WebAssembly] Introduce a new pseudo-operand for unused expression results.
llvm-svn: 252975
2015-11-13 00:21:05 +00:00
Vyacheslav Klochkov cbc56baae6 X86-FMA3: Implemented commute transformations FMA*_Int instructions.
It made it possible to apply the memory folding optimization for the 2nd
operand of FMA*_Int instructions.

Reviewer: Quentin Colombet
Differential Revision: http://reviews.llvm.org/D14550

llvm-svn: 252973
2015-11-13 00:07:35 +00:00
Colin LeMahieu 7a92c6ecbb [Hexagon] Adding checks for values out of operand range and correct new-value producer usage.
llvm-svn: 252969
2015-11-12 23:28:01 +00:00
Colin LeMahieu a1fa71ead9 [Hexagon] Adding test to make sure labels and register pairs are correctly parsed.
llvm-svn: 252968
2015-11-12 22:54:14 +00:00
Sanjay Patel fbaf5a9534 specify triple and tighten checks using update_llc_test_checks.py
llvm-svn: 252962
2015-11-12 22:27:38 +00:00
Tom Stellard 0967c91e0c Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

llvm-svn: 252956
2015-11-12 21:43:25 +00:00
Mike Aizatsky ba8a5b1f91 disabling sancov tests: too many failures on different platforms.
Differential Revision: http://reviews.llvm.org/D14624

llvm-svn: 252945
2015-11-12 20:47:12 +00:00
Mike Aizatsky 67e3d651f5 sancov tests - platform independent separators
llvm-svn: 252943
2015-11-12 20:17:49 +00:00
Tobias Grosser 8241795d20 Revert "Fix bug 25440: GVN assertion after coercing loads"
This reverts 252919 which broke LNT: MultiSource/Applications/SPASS

llvm-svn: 252936
2015-11-12 20:04:21 +00:00
Mike Aizatsky 14a06ac05e sancov test suite
Differential Revision: http://reviews.llvm.org/D14589

llvm-svn: 252933
2015-11-12 19:34:21 +00:00
Teresa Johnson ba5d68dfff [ThinLTO] Update test to be more tolerant of ordering changes
Update the ThinLTO function importing test to use DAG forms of checks so
that it is more tolerant of changes to relative ordering between
imported decls/defs. This reduces the number of changes required by the
comdat importing patch I am sending for review shortly.

llvm-svn: 252932
2015-11-12 19:31:46 +00:00
Nathan Slingerland 911ced6bf3 reverting r252916 to investigate test failure
llvm-svn: 252921
2015-11-12 18:39:26 +00:00
Weiming Zhao eed0145dd2 Fix bug 25440: GVN assertion after coercing loads
Summary:
when coercing loads, it inserts some instructions, which have no GV assigned.

https://llvm.org/bugs/show_bug.cgi?id=25440


Reviewers: hfinkel, dberlin

Subscribers: dberlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14479

llvm-svn: 252919
2015-11-12 18:19:59 +00:00
Quentin Colombet 94dc1e0d34 [ShrinkWrap] Make sure we do not mess up with EH funclet lowering.
ShrinkWrapping does not understand exception handling constraints for now, so
make sure we do not mess with them by aborting on functions that use EH
funclets.

llvm-svn: 252917
2015-11-12 18:13:42 +00:00
Nathan Slingerland f0e107e38a [llvm-profdata] Add check for text profile formats and improve error reporting
Summary:
This change addresses two possible instances of user error / confusion when
merging sampled profile data.

Previously any input that didn't match the raw or processed instrumented format
would automatically be interpreted as instrumented profile text format data.
No error would be reported during the merge.

Example:
If foo-sampled.profdata and bar-sampled.profdata are binary sampled profiles:

Old behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -output foobar-sampled.profdata
$ llvm-profdata show -sample foobar-sampled.profdata
error: foobar-sampled.profdata:1: Expected 'mangled_name:NUM:NUM', found  lprofi

This change adds basic checks for valid input data when assuming text input.
It also makes error messages related to file format validity more specific about
the assumbed profile data type.

New behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -o foobar-sampled.profdata
error: foo.profdata: Unrecognized instrumentation profile encoding format
Perhaps you forgot to use the -sample option?

Reviewers: bogner, davidxl, dnovillo

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14558

llvm-svn: 252916
2015-11-12 18:06:18 +00:00
Dan Gohman cf4748f180 [WebAssembly] Reapply r252858, with svn add for the new file.
Switch to MC for instruction printing.

This encompasses several changes which are all interconnected:
 - Use the MC framework for printing almost all instructions.
 - AsmStrings are now live.
 - This introduces an indirection between LLVM vregs and WebAssembly registers,
   and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
   This addresses some basic issues with argument registers and unused registers.
 - The way ARGUMENT instructions are handled no longer generates redundant
   get_local+set_local for every argument.

This also changes the assembly syntax somewhat; most notably, MC's printing
does not use sigils on label names, so those are no longer present, and
push/pop now have a sigil to keep them unambiguous.

The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.

llvm-svn: 252910
2015-11-12 17:04:33 +00:00
Michael Zuckerman fd3fe9e45a [x86] translating "fp" (floating point) instructions from {fadd,fdiv,fmul,fsub,fsubr,fdivr} to {faddp,fdivp,fmulp,fsubp,fsubrp,fdivrp}
LLVM Missing the following instructions: fadd\fdiv\fmul\fsub\fsubr\fdivr.
GAS and MS supporting this instruction and lowering them in to a faddp\fdivp\fmulp\fsubp\fsubrp\fdivrp instructions.

Differential Revision: http://reviews.llvm.org/D14217

llvm-svn: 252908
2015-11-12 16:58:51 +00:00
Hans Wennborg 7384a2de02 Revert r252858: "[WebAssembly] Switch to MC for instruction printing."
It broke the CMake build:

"Cannot find source file: WebAssemblyRegNumbering.cpp"

llvm-svn: 252897
2015-11-12 14:37:56 +00:00
Vasileios Kalintiris 48e0256ed6 Re-apply "[mips] Use correct frame register for DWARF info when dynamically realigning the stack.""
r252219 reversed the direction of subprogram -> function edge. Fixed the
IR to account for this.

llvm-svn: 252895
2015-11-12 14:11:43 +00:00
James Molloy 8e99e97f2a [ARM] CMOV->BFI combining: handle both senses of CMPZ
I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was.

If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases.

llvm-svn: 252891
2015-11-12 13:49:17 +00:00
Renato Golin 93064025bd Revert "[ARM] Enable shrink-wrapping by default."
This reverts commit r252825, as it broke ASAN on ARM. Investigating...

llvm-svn: 252889
2015-11-12 13:34:50 +00:00
Daniel Sanders 9f6ad49740 Implement .reloc (constant offset only) with support for R_MIPS_NONE and R_MIPS_32.
Summary:
Support for R_MIPS_NONE allows us to parse MIPS16's usage of .reloc.
R_MIPS_32 was included to be able to better test the directive.

Targets can add their relocations by overriding MCAsmBackend::getFixupKind().

Subscribers: grosbach, rafael, majnemer, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D13659

llvm-svn: 252888
2015-11-12 13:33:00 +00:00
Zlatko Buljan 797c2aec6b [mips][microMIPS] Implement LWM16, SB16, SH16, SW16, SWSP and SWM16 instructions
Differential Revision: http://reviews.llvm.org/D11406

llvm-svn: 252885
2015-11-12 13:21:33 +00:00
Vasileios Kalintiris d38860610d Revert "[mips] Use correct frame register for DWARF info when dynamically realigning the stack."
This reverts commit r252882. LLParser complains for invalid field 'function'
in DISubprogram.

llvm-svn: 252884
2015-11-12 13:19:11 +00:00
Vasileios Kalintiris 352eb55baf [mips] Use correct frame register for DWARF info when dynamically realigning the stack.
Summary:
This patch overrides TargetFrameLowering::getFrameIndexReference() in order to
specify the correct register when the function needs dynamic stack realignment.
The values returned from this function are used in order to create DW_AT_locations
for DWARF info. These locations would use the wrong registers as it's been
reported in PR25028.

Reviewers: dsanders

Subscribers: dean, llvm-commits

Differential Revision: http://reviews.llvm.org/D13511

llvm-svn: 252882
2015-11-12 13:04:16 +00:00
James Molloy 2d09c00b91 [InstCombine] Add trivial folding (bitreverse (bitreverse x)) -> x
There are plenty more instcombines we could probably do with bitreverse, but this seems like a very obvious and trivial starting point and was brought up by Hal in his review.

llvm-svn: 252879
2015-11-12 12:39:41 +00:00
James Molloy 90111f79f9 [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic
Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target.

This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support.

The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently).

llvm-svn: 252878
2015-11-12 12:29:09 +00:00
James Molloy 7e9bdd5d01 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

llvm-svn: 252871
2015-11-12 10:55:20 +00:00
Kuba Brecka de8332257b [Object, MachO] Mark symbols from DATA and BSS sections as ST_Data
In `MachOObjectFile::getSymbolType` we currently always return `SymbolRef::ST_Function` for symbols from any section. In order for llvm-symbolizer to correctly symbolize Mach-O globals, symbols from data and BSS sections should return `SymbolRef::ST_Data`.

Differential Revision: http://reviews.llvm.org/D14576

llvm-svn: 252867
2015-11-12 09:40:29 +00:00
Amjad Aboud e59cc3e540 dwarfdump: Added macro support to llvm-dwarfdump tool.
Added "macro" option to "-debug-dump" flag, which trigger parsing and dumping of the ".debug_macinfo" section.

Differential Revision: http://reviews.llvm.org/D14294

llvm-svn: 252866
2015-11-12 09:38:54 +00:00
James Molloy 9a32da74f7 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

llvm-svn: 252863
2015-11-12 09:05:43 +00:00
James Molloy b14994e752 [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.

llvm-svn: 252862
2015-11-12 08:53:04 +00:00
David Blaikie 6400fc146e Mostly revert 252842 due to failures on some buildbots.
I imagine there's some UB in here somewhere, though Valgrind doesn't
seem to have picked it up (not sure if I have a working asan build right
now to test there).

GDB bot seems to be crashing:
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/26267/steps/check-all/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dwp.test

Hexagon ELF bot is, presumably, just getting different output:
http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/32927/steps/check-all/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dwp.test

llvm-svn: 252859
2015-11-12 06:33:14 +00:00
Dan Gohman 9dd55a8065 [WebAssembly] Switch to MC for instruction printing.
This encompasses several changes which are all interconnected:
 - Use the MC framework for printing almost all instructions.
 - AsmStrings are now live.
 - This introduces an indirection between LLVM vregs and WebAssembly registers,
   and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
   This addresses some basic issues with argument registers and unused registers.
 - The way ARGUMENT instructions are handled no longer generates redundant
   get_local+set_local for every argument.

This also changes the assembly syntax somewhat; most notably, MC's printing
use sigils on label names, so those are no longer present, and push/pop now
have a sigil to keep them unambiguous.

The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.

llvm-svn: 252858
2015-11-12 06:10:03 +00:00
David Blaikie 5b9bf49c6f dwarfdump: Dump the contents of DWP indexes
llvm-svn: 252842
2015-11-12 01:41:52 +00:00
Matthias Braun b9610a6bc2 LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization
- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

Differential Revision: http://reviews.llvm.org/D11172

llvm-svn: 252839
2015-11-12 01:02:47 +00:00
Manman Ren 3f2b9c18e2 [TLS on Darwin] use a different mask for tls calls on x86-64.
Calls involved in thread-local variable lookup save more registers
than normal calls.

rdar://problem/23073171

llvm-svn: 252837
2015-11-12 00:54:04 +00:00
Quentin Colombet 10f9813528 [ARM] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14357

rdar://problem/21942589

llvm-svn: 252825
2015-11-11 23:31:46 +00:00
Reid Kleckner b9204a584c [WinEH] Don't forward branches across empty EH pad BBs
For really simple SEH catchpads, we tried to forward the invoke unwind
edge across the empty block.

llvm-svn: 252822
2015-11-11 23:09:31 +00:00
David Majnemer f0f224d12d [IR] Add support for empty tokens
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token.  Currently, we have no
mechanism to represent an initial token state.

Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like.  This new constant
is called ConstantTokenNone and is written textually as "token none".

Differential Revision: http://reviews.llvm.org/D14581

llvm-svn: 252811
2015-11-11 21:57:16 +00:00
Sanjoy Das cdafd8490a Introduce deoptimization operand bundles
Summary:
This change introduces the notion of "deoptimization" operand bundles.
LLVM can recognize and optimize these in more precise ways than it can a
generic "unknown" operand bundles.

The current form of this special recognition / optimization is an enum
entry in LLVMContext, a LangRef blurb and a verifier rule.  Over time we
will teach LLVM to do more aggressive optimization around deoptimization
operand bundles, exploiting known facts about kinds of state
deoptimization operand bundles are allowed to track.

Reviewers: reames, majnemer, chandlerc, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14551

llvm-svn: 252806
2015-11-11 21:38:02 +00:00
Hemant Kulkarni bdce12a01b [Symbolizer]: Add -pretty-print option
Differential Revision: http://reviews.llvm.org/D13671

llvm-svn: 252798
2015-11-11 20:41:43 +00:00
Yunzhong Gao ea7b3a2320 Add a libLTO diagnostic handler that supports lto_get_error_message API
This is a follow-up from the previous discussion on the thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151019/307763.html

The LibLTO lto_get_error_message() API reads error messages from a std::string
sLastErrorString. Instead of passing this string around as an argument, this
patch creates a diagnostic handler and then sends this handler to the
constructor of LTOCodeGenerator.

Differential Revision: http://reviews.llvm.org/D14313

llvm-svn: 252791
2015-11-11 19:59:08 +00:00
Geoff Berry 2ddfc5e60f [DAGCombiner] Improve zextload optimization.
Summary:
Don't fold
  (zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
  (and (load x) cst)
will match as a zextload already and has additional users.

For example, the following IR:

  %load = load i32, i32* %ptr, align 8
  %load16 = and i32 %load, 65535
  %load64 = zext i32 %load16 to i64
  store i32 %load16, i32* %dst1, align 4
  store i64 %load64, i64* %dst2, align 8

used to produce the following aarch64 code:

	ldr		w8, [x0]
	and	w9, w8, #0xffff
	and	x8, x8, #0xffff
	str		w9, [x1]
	str		x8, [x2]

but with this change produces the following aarch64 code:

	ldrh		w8, [x0]
	str		w8, [x1]
	str		x8, [x2]

Reviewers: resistor, mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14340

llvm-svn: 252789
2015-11-11 19:42:52 +00:00
David Blaikie 51c402838c dwarfdump: DWP type unit index dumping skeleton
llvm-svn: 252786
2015-11-11 19:40:49 +00:00
David Blaikie 65a8efe441 dwarfdump: First piece of support for DWP dumping
Just a tiny piece of index dumping - the header in this instance.

llvm-svn: 252781
2015-11-11 19:28:21 +00:00
Joseph Tremoulet 9f467353a5 [WinEH] Only generate UnwindHelp slot for MSVCXX
Summary: Other personalities don't use this special frame slot.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14580

llvm-svn: 252778
2015-11-11 19:21:09 +00:00
Colin LeMahieu da6cafffc0 Reverting r252760
llvm-svn: 252770
2015-11-11 18:11:06 +00:00
Dehao Chen 72fdf444b7 Emit discriminator for inlined callsites.
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.

Reviewers: dnovillo, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14511

llvm-svn: 252768
2015-11-11 18:08:18 +00:00
Diego Novillo 0354a9f67b SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.

The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).

This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.

llvm-svn: 252763
2015-11-11 17:54:37 +00:00
Hemant Kulkarni c6638c7561 [Symbolizer]: Add -pretty-print option
Differential Revision: http://reviews.llvm.org/D13671

llvm-svn: 252760
2015-11-11 17:47:54 +00:00
Sanjay Patel f740129198 [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any MIPS32
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:
  jr  $ra
  clz  $2, $4

cttz:
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  jr  $ra
  subu  $2, $2, $1

Instead of:

ctlz:
  beqz  $4, $BB0_2
  addiu  $2, $zero, 32
  clz  $2, $4
$BB0_2:
  jr  $ra
  nop

cttz:
  beqz  $4, $BB1_2
  addiu  $2, $zero, 32
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  subu  $2, $2, $1
$BB1_2:
  jr  $ra
  nop

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14500

llvm-svn: 252755
2015-11-11 17:24:56 +00:00
Artyom Skrobov fe96beea57 test/DebugInfo/ARM/prologue_end.ll references thumbv1, which is invalid.
The committer didn't respond at http://reviews.llvm.org/D14338, so we've got to fix this for them.

This test doesn't pass with thumbv6, so I suppose what they meant is thumbv7.

llvm-svn: 252754
2015-11-11 17:22:18 +00:00
Daniel Sanders 70dd2d7ab9 [mips] Move MC tests for the DSP ASE into the standard format.
Summary:
Only DSPr2 is present because it appears we've never added DSPr1 tests.
We'll have to correct that in a later patch.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14448

llvm-svn: 252752
2015-11-11 16:50:13 +00:00
Douglas Katzman a14039764b Visibly fail if attempting to encode register AH,BH,CH,DH in a REX-prefixed instruction.
Differential Revision: http://reviews.llvm.org/D13316
Fixes PR25003

llvm-svn: 252743
2015-11-11 15:51:16 +00:00
James Molloy ce12c92f66 [ARM] Combine BFIs together
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.

llvm-svn: 252740
2015-11-11 15:40:40 +00:00
Michael Kuperstein 12982a816c [X86] Replace LEAs with INC/DEC when profitable
If possible and profitable, replace lea %reg, 1(%reg) and lea %reg, -1(%reg) with inc %reg and dec %reg respectively.

Patch by: anton.nadolsky@intel.com
Differential Revision: http://reviews.llvm.org/D14059

llvm-svn: 252722
2015-11-11 11:44:31 +00:00
Yury Gribov d7731988ef [ASan] Enable optional ASan recovery.
Differential Revision: http://reviews.llvm.org/D14242

llvm-svn: 252719
2015-11-11 10:36:49 +00:00
Akira Hatanaka 5dda592643 Sort the enums in Attributes.h in case insensitive alphabetical order.
Sort the enums in preparation for moving the attributes to a table-gen
file.

rdar://problem/19836465

llvm-svn: 252692
2015-11-11 02:11:46 +00:00
Dan Gohman 754cd11d90 [WebAssembly] Support non-legal argument and return types.
llvm-svn: 252687
2015-11-11 01:33:02 +00:00
Ahmed Bougacha 4a85643907 [MC] Use LShr for constant evaluation of ">>" on non-arm64 darwin.
Follow-up to r235963: this matches other assemblers and is less
unexpected (e.g. PR23227).

llvm-svn: 252681
2015-11-11 00:51:36 +00:00
Matt Arsenault 6690d7de39 AMDGPU: Set isAllocatable = 0 on VS_32/VS_64
llvm-svn: 252674
2015-11-11 00:01:32 +00:00
Sanjoy Das 925681053d [ValueTracking] Teach isImpliedCondition a new bitwise trick
Summary:
This change teaches isImpliedCondition to prove things like

  (A | 15) < L  ==>  (A | 14) < L

if the low 4 bits of A are known to be zero.

Depends on D14391

Reviewers: majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14392

llvm-svn: 252673
2015-11-10 23:56:20 +00:00
Reid Kleckner 7f84a939ed [WinEH] Insert the MBB for EH_RESTORE after the catchret
Inserting it before the target block could be bad, we might already have
a fallthrough edge to it.

llvm-svn: 252670
2015-11-10 23:22:20 +00:00
Dan Gohman b84ae9bb38 [WebAssembly] Support for floating point min and max.
llvm-svn: 252653
2015-11-10 21:40:21 +00:00
Bill Schmidt 34af5e1c76 [PowerPC] Add an MI SSA peephole pass.
This patch adds a pass for doing PowerPC peephole optimizations at the
MI level while the code is still in SSA form.  This allows for easy
modifications to the instructions while depending on a subsequent pass
of DCE.  Both passes are very fast due to the characteristics of SSA.

At this time, the only peepholes added are for cleaning up various
redundancies involving the XXPERMDI instruction.  However, I would
expect this will be a useful place to add more peepholes for
inefficiencies generated during instruction selection.  The pass is
placed after VSX swap optimization, as it is best to let that pass
remove unnecessary swaps before performing any remaining clean-ups.

The utility of these clean-ups are demonstrated by changes to four
existing test cases, all of which now have tighter expected code
generation.  I've also added Eric Schweiz's bugpoint-reduced test from
PR25157, for which we now generate tight code.  One other test started
failing for me, and I've fixed it
(test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not
related to my changes, and I'm not sure why it works before and not
after.  The problem is that the CHECK-NOT: of "statepoint" from test1
fails because of the "statepoint" in test2, and so forth.  Adding a
CHECK-LABEL in between keeps the different occurrences of that string
properly scoped.

llvm-svn: 252651
2015-11-10 21:38:26 +00:00
Adrian Prantl e39475d44d dsymutil: Prune module forward decl DIEs if a uniquable definition was
already emitted and fix a latent bug in DIECloner where the DW_CHILDREN_yes
flag is set based on the number of children in the input DIE rather than
the number of children that are actually being cloned.

rdar://problem/23439845

llvm-svn: 252649
2015-11-10 21:31:05 +00:00
Teresa Johnson 2d5fb8cac4 Ensure ModuleLinker materializes complete comdat groups
Summary:
The module linker lazy links some "discardable if unused" global
values (e.g. linkonce), materializing and linking them only
if they are referenced in the module. If a comdat group contains a
linkonce member that is not referenced, however, it would not be
materialized and linked, leading to an incomplete comdat group.

If there are other object files not part of the same LTO link that also
define and use that comdat group, the linker may select the incomplete
group leading to link time unsats.

To solve this, whenever a global value body is linked, make sure we
materialize any other members of the same comdat group that are not yet
materialized. This ensures they are in the lazy link list and get linked
as well.

Added new test and adjusted old test to remove parts that didn't
make sense with fix.

Reviewers: rafael

Subscribers: dexonsmith, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14516

llvm-svn: 252647
2015-11-10 21:09:06 +00:00
Sanjay Patel af1b48bfdc [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:               
  clz  r0, r0
  bx  lr
cttz:              
  rbit  r0, r0
  clz  r0, r0
  bx  lr

Instead of:

ctlz:    
  cmp  r0, #0
  moveq  r0, #32
  clzne  r0, r0
  bx  lr
cttz:     
  cmp   r0, #0
  moveq  r0, #32
  rbitne  r0, r0
  clzne  r0, r0
  bx  lr

This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818

Differential Revision: http://reviews.llvm.org/D14469

llvm-svn: 252639
2015-11-10 19:24:31 +00:00
Yunzhong Gao ef436f068c llvm-lto: trivial spelling changes to distinguish custom diagnostic handler and
default diagnostic handler.

Differential Revision: http://reviews.llvm.org/D14520

llvm-svn: 252633
2015-11-10 18:52:48 +00:00
Philip Reames 2d858747df [ValueTracking] Recognize that and(x, add (x, -1)) clears the low bit
This is a cleaned up version of a patch by John Regehr with permission. Originally found via the souper tool.

If we add an odd number to x, then bitwise-and the result with x, we know that the low bit of the result must be zero. Either it was zero in x originally, or the add cleared it in the temporary value. As a result, one of the two values anded together must have the bit cleared.

Differential Revision: http://reviews.llvm.org/D14315

llvm-svn: 252629
2015-11-10 18:46:14 +00:00
Teresa Johnson 3cd8161c9b [ThinLTO] WeakAny fixes/cleanup
Ensure WeakAny variables are imported as ExternalWeak declarations. To
handle WeakAny more consistently and fix this issue:

1) Update helper doImportAsDefinition to properly flag WeakAny variables
   and aliases as not importing defintions.

   Update callers of doImportAsDefinition to remove now redundant checks for
   WeakAny aliases, or ignore aliases, as appropriate.

2) Add any !doImportAsDefinition GVs to DoNotLinkFromSource set during
   linking of the GV prototype, where we usually add GVs to the
   DoNotLinkFromSource set for other reasons.

   Remove now unnecessary adding of WeakAny aliases to
   DoNotLinkFromSource set from copyGlobalAliasProto.

   Remove now unnecessary guard against linking non-imported function
   bodies from ModuleLinker::run.

llvm-svn: 252626
2015-11-10 18:20:11 +00:00
Sanjay Patel 241c31fb64 [AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any AArch64
implementation.

The net result of allowing this speculation for the regression tests in this
patch is that we get this code:

ctlz:
  clz  w0, w0
  ret

cttz:
  rbit  w8, w0
  clz  w0, w8
  ret

Instead of:

ctlz:
  cbz  w0, .LBB0_2
  clz  w0, w0
  ret
.LBB0_2:
  orr  w0, wzr, #0x20
  ret

cttz:
  cbz  w0, .LBB1_2
  rbit  w8, w0
  clz  w0, w8
  ret
.LBB1_2:
  orr  w0, wzr, #0x20
  ret

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14505

llvm-svn: 252625
2015-11-10 18:11:37 +00:00
Renato Golin 0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00