llvm-project

Commit Graph

Author	SHA1	Message	Date
Suyog Sarda	43fae93da8	Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." This was re-ordering floating point data types resulting in mismatch in output. llvm-svn: 224424	2014-12-17 10:34:27 +00:00
Erik Eckstein	a451b9b0b5	Strength reduce intrinsics with overflow into regular arithmetic operations if possible. Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced. This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow. It completes the work on PR20194. llvm-svn: 224417	2014-12-17 07:29:19 +00:00
Duncan P. N. Exon Smith	92731d26bc	Revert "Linker: Drop superseded subprograms" This reverts commit r224389. Based on feedback from the bots, the assertion seems to be going off more often, not less (previously I was just seeing it in an internal bootstrap, now it's happening in public builds too). http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/936/ http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/5325 Reverting in order to investigate. llvm-svn: 224416	2014-12-17 07:27:31 +00:00
Justin Hibbits	0c0d5deff1	Add parsing of 'foo@local'. Summary: Currently, it supports generating, but not parsing, this expression. Test added as well. Test Plan: New test added, no regressions due to this. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6672 llvm-svn: 224415	2014-12-17 06:23:35 +00:00
Rafael Espindola	5f06030989	Remove a debugging assert. Sorry for the noise, I have no idea how it survived to the final version. llvm-svn: 224414	2014-12-17 03:38:04 +00:00
Rafael Espindola	81adfb5c2e	Fix the windows build. llvm-svn: 224412	2014-12-17 02:42:20 +00:00
Rafael Espindola	97935a9123	Refactor and simplify the code reading /proc/cpuinfo. NFC. llvm-svn: 224410	2014-12-17 02:32:44 +00:00
Matthias Braun	f4a72cd06e	RegisterCoalescer: Sprinkle some const modifiers. llvm-svn: 224409	2014-12-17 02:18:13 +00:00
Nick Lewycky	52ee5e446b	Delete debugging cruft that crept in with r223802. llvm-svn: 224407	2014-12-17 01:56:51 +00:00
David Majnemer	65c52ae8ca	InstSimplify: shl nsw/nuw undef, %V -> undef We can always choose a value for undef which might cause %V to shift out an important bit except for one case, when %V is zero. However, shl behaves like an identity function when the right hand side is zero. llvm-svn: 224405	2014-12-17 01:54:33 +00:00
Nick Lewycky	ee0a3a7a2f	Make ValueEnumerator::print use OS for metadata too. Noticed by inspection. llvm-svn: 224404	2014-12-17 01:52:08 +00:00
Quentin Colombet	fc2201e922	[CodeGenPrepare] Reapply r224351 with a fix for the assertion failure: The type promotion helper does not support vector types, so make sure it does not kick in in such cases. Original commit message: [CodeGenPrepare] Move sign/zero extensions near loads using type promotion. This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet. Context Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes, this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern. Motivating Example Let us consider an example: define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) { %ld = load i8* %addr1 %zextld = zext i8 %ld to i32 %ld2 = load i32* %addr2 %add = add nsw i32 %ld2, %zextld %sextadd = sext i32 %add to i64 %zexta = zext i8 %a to i32 %addza = add nsw i32 %zexta, %zextld %sextaddza = sext i32 %addza to i64 %addb = add nsw i32 %b, %zextld %sextaddb = sext i32 %addb to i64 call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb) ret void } As it is, this IR generates the following assembly on x86_64: [...] 
movzbl (%rdi), %eax # zero-extended load movl (%rsi), %esi # plain load addl %eax, %esi # 32-bit add movslq %esi, %rdi # sign extend the result of add movzbl %dl, %edx # zero extend the first argument addl %eax, %edx # 32-bit add movslq %edx, %rsi # sign extend the result of add addl %eax, %ecx # 32-bit add movslq %ecx, %rdx # sign extend the result of add [...] The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA. Now, by promoting the additions to form more extended loads we would generate: [...] movzbl (%rdi), %eax # zero-extended load movslq (%rsi), %rdi # sign-extended load addq %rax, %rdi # 64-bit add movzbl %dl, %esi # zero extend the first argument addq %rax, %rsi # 64-bit add movslq %ecx, %rdx # sign extend the second argument addq %rax, %rdx # 64-bit add [...] The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequences happen a lot on code using 32-bit indexes on 64-bit architectures. Note: The throughput numbers are similar on Sandy Bridge and Haswell. Proposed Solution To avoid the penalty of all these sign/zero extensions, we merge them in the loads at the beginning of the chain of computation by promoting all the chain of computation on the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing “move ext to load” optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation: ext(promotableInst1(...(promotableInstN(load)))) => promotedInst1(...(promotedInstN(ext(load)))) The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the “move ext to load” optimization as it was before this patch. 
Performance Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10. Tested Optimization Levels: O3/Os Tests: llvm-testsuite + externals. Results: - No regression beside noise. - Improvements: CINT2006/473.astar: ~2% Benchmarks/PAQ8p: ~2% Misc/perlin: ~3% The results are consistent for both O3 and Os. <rdar://problem/18310086> llvm-svn: 224402	2014-12-17 01:36:17 +00:00
Kevin Enderby	57538299e8	Add printing the LC_ENCRYPTION_INFO_64 load command with llvm-objdump’s -private-headers and add tests for the two AArch64 binaries. llvm-svn: 224400	2014-12-17 01:01:30 +00:00
David Blaikie	8b979f01c6	PR21875: codegen for non-type template parameters of nullptr_t type llvm-svn: 224399	2014-12-17 00:43:22 +00:00
Reid Kleckner	04b69f89aa	Revert "[CodeGenPrepare] Move sign/zero extensions near loads using type promotion." This reverts commit r224351. It causes assertion failures when building ICU. llvm-svn: 224397	2014-12-17 00:29:23 +00:00
Hans Wennborg	224cb82a39	SelectionDAG switch lowering: use 'unsigned' to count destination popularity SwitchInst::getNumCases() returns unsigned, so using uint64_t to count cases seems unnecessary. Also fix a missing CHECK in the test case. llvm-svn: 224393	2014-12-16 23:41:59 +00:00
Colin LeMahieu	aa1bade7b4	[Hexagon] Updating doubleword shift usages to new versions. llvm-svn: 224391	2014-12-16 23:36:15 +00:00
Kevin Enderby	0804f467f2	Add printing the LC_ENCRYPTION_INFO load command with llvm-objdump’s -private-headers. llvm-svn: 224390	2014-12-16 23:25:52 +00:00
Duncan P. N. Exon Smith	8759026893	Linker: Drop superseded subprograms When a function gets replaced by `ModuleLinker`, drop superseded subprograms. This ensures that the "first" subprogram pointing at a function is the same one that `!dbg` references point at. This is a stop-gap fix for PR21910. Notably, this fixes Release+Asserts bootstraps that are currently asserting out in `LexicalScopes::initialize()` due to the explicit instantiations in `lib/IR/Dominators.cpp` eventually getting replaced by -argpromotion. llvm-svn: 224389	2014-12-16 23:23:41 +00:00
Simon Pilgrim	bf1e079005	[X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps) Added a missing memory folding relationship for the (V)CVTPD2PS instruction - we can safely fold these for stack reloads. Differential Revision: http://reviews.llvm.org/D6663 llvm-svn: 224383	2014-12-16 22:30:10 +00:00
Rafael Espindola	9573a9cf9d	Make the assert a bit stronger. We should get no declarations in here. llvm-svn: 224382	2014-12-16 22:29:43 +00:00
Colin LeMahieu	7fc90fc7e9	[Hexagon] Removing old XTYPE/BIT instructions and replacing usages. llvm-svn: 224381	2014-12-16 22:17:09 +00:00
Sanjay Patel	7129c10cae	merge consecutive loads that are offset from a base address SelectionDAG::isConsecutiveLoad() was not detecting consecutive loads when the first load was offset from a base address. This patch recognizes that pattern and subtracts the offset before comparing the second load to see if it is consecutive. The codegen change in the new test case improves from: vmovsd 32(%rdi), %xmm0 vmovsd 48(%rdi), %xmm1 vmovhpd 56(%rdi), %xmm1, %xmm1 vmovhpd 40(%rdi), %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 To: vmovups 32(%rdi), %ymm0 An existing test case is also improved from: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovsd 24(%rdi), %xmm2 vunpcklpd %xmm2, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm2[0] vmovhpd 8(%rdi), %xmm1, %xmm3 To: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovhpd 24(%rdi), %xmm0, %xmm0 vmovhpd 8(%rdi), %xmm1, %xmm1 This patch fixes PR21771 ( http://llvm.org/bugs/show_bug.cgi?id=21771 ). Differential Revision: http://reviews.llvm.org/D6642 llvm-svn: 224379	2014-12-16 21:57:18 +00:00
Colin LeMahieu	f5acc8c625	[Hexagon] Adding tstbit/bitclr/bitset instructions. llvm-svn: 224374	2014-12-16 21:28:58 +00:00
Kostya Serebryany	7376294086	[sanitizer] prevent function call merging for sanitizer-coverage callbacks llvm-svn: 224372	2014-12-16 21:24:15 +00:00
Colin LeMahieu	615757f2f1	[Hexagon] Adding bit count and twiddling instructions. llvm-svn: 224367	2014-12-16 20:57:56 +00:00
Colin LeMahieu	6fce46baf6	[Hexagon] Adding asr/lsr/asl reg/imm, asl with saturation, asr with rounding. Doubleword abs/neg/not. Interleave and deinterleave instructions. llvm-svn: 224365	2014-12-16 20:40:23 +00:00
JF Bastien	5d3280c7a7	x86-32: PUSHF/POPF use/def EFLAGS Summary: As a side-quest for D6629 jvoung pointed out that I should use -verify-machineinstrs and this found a bug in x86-32's handling of EFLAGS for PUSHF/POPF. This patch fixes the use/def, and adds -verify-machineinstrs to all x86 tests which contain 'EFLAGS'. One exception: this patch leaves inline-asm-fpstack.ll as-is because it fails -verify-machineinstrs in a way unrelated to EFLAGS. This patch also modifies cmpxchg-clobber-flags.ll along the lines of what D6629 already does by also testing i386. Test Plan: ninja check Reviewers: t.p.northover, jvoung Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6687 llvm-svn: 224359	2014-12-16 20:15:45 +00:00
Rafael Espindola	a4a94f1b55	Use CastInst::castIsValid to simplify the verifier. Also delete a dead member variable. llvm-svn: 224356	2014-12-16 19:29:29 +00:00
Matt Arsenault	31a52ad48c	NVPTX: Remove duplicate of AsmPrinter::lowerConstant llvm-svn: 224355	2014-12-16 19:16:17 +00:00
Matt Arsenault	dd3b77d64c	Move lowerConstant to AsmPrinter This was a static function before, and NVPTX duplicated it because it wasn't exposed. llvm-svn: 224354	2014-12-16 19:16:14 +00:00
Quentin Colombet	d5e57b731f	[CodeGenPrepare] Move sign/zero extensions near loads using type promotion. This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet. Context Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes, this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern. Motivating Example Let us consider an example: define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) { %ld = load i8* %addr1 %zextld = zext i8 %ld to i32 %ld2 = load i32* %addr2 %add = add nsw i32 %ld2, %zextld %sextadd = sext i32 %add to i64 %zexta = zext i8 %a to i32 %addza = add nsw i32 %zexta, %zextld %sextaddza = sext i32 %addza to i64 %addb = add nsw i32 %b, %zextld %sextaddb = sext i32 %addb to i64 call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb) ret void } As it is, this IR generates the following assembly on x86_64: [...] movzbl (%rdi), %eax # zero-extended load movl (%rsi), %esi # plain load addl %eax, %esi # 32-bit add movslq %esi, %rdi # sign extend the result of add movzbl %dl, %edx # zero extend the first argument addl %eax, %edx # 32-bit add movslq %edx, %rsi # sign extend the result of add addl %eax, %ecx # 32-bit add movslq %ecx, %rdx # sign extend the result of add [...] The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA. 
Now, by promoting the additions to form more extended loads we would generate: [...] movzbl (%rdi), %eax # zero-extended load movslq (%rsi), %rdi # sign-extended load addq %rax, %rdi # 64-bit add movzbl %dl, %esi # zero extend the first argument addq %rax, %rsi # 64-bit add movslq %ecx, %rdx # sign extend the second argument addq %rax, %rdx # 64-bit add [...] The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequences happen a lot on code using 32-bit indexes on 64-bit architectures. Note: The throughput numbers are similar on Sandy Bridge and Haswell. Proposed Solution To avoid the penalty of all these sign/zero extensions, we merge them in the loads at the beginning of the chain of computation by promoting all the chain of computation on the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing “move ext to load” optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation: ext(promotableInst1(...(promotableInstN(load)))) => promotedInst1(...(promotedInstN(ext(load)))) The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the “move ext to load” optimization as it was before this patch. Performance Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10. Tested Optimization Levels: O3/Os Tests: llvm-testsuite + externals. Results: - No regression beside noise. - Improvements: CINT2006/473.astar: ~2% Benchmarks/PAQ8p: ~2% Misc/perlin: ~3% The results are consistent for both O3 and Os. <rdar://problem/18310086> llvm-svn: 224351	2014-12-16 19:09:03 +00:00
Robert Khasanov	d04cd2fbfe	[AVX512] Enable integer arithmetic lowering for AVX512BW/VL subsets. Added lowering tests. llvm-svn: 224349	2014-12-16 18:24:07 +00:00
Colin LeMahieu	1944a8cd04	[Hexagon] Adding absolute value, and negate with saturation llvm-svn: 224346	2014-12-16 17:44:49 +00:00
Sanjay Patel	e46d54f0bf	combine consecutive subvector 16-byte loads into one 32-byte load This is a fix for PR21709 ( http://llvm.org/bugs/show_bug.cgi?id=21709 ). When we have 2 consecutive 16-byte loads that are merged into one 32-byte vector, we can use a single 32-byte load instead. But we don't do this for SandyBridge / IvyBridge because they have slower 32-byte memops. We also don't bother using 32-byte integer loads on a machine that only has AVX1 (btver2) because those operands would have to be split in half anyway since there is no support for 32-byte integer math ops. Differential Revision: http://reviews.llvm.org/D6492 llvm-svn: 224344	2014-12-16 16:30:01 +00:00
Colin LeMahieu	455f24aa77	[Hexagon] Adding saturate and swizzle instructions. llvm-svn: 224343	2014-12-16 16:27:17 +00:00
Robert Khasanov	8d9b93eac8	[AVX512] Add a comment for avx512_broadcast_pat multiclass llvm-svn: 224341	2014-12-16 16:12:11 +00:00
Colin LeMahieu	d9b23509bf	[Hexagon] Removing old multiply defs and updating references to new versions. llvm-svn: 224340	2014-12-16 16:10:01 +00:00
Vladimir Medic	e88609388a	The single check for N64 inside MipsDisassemblerBase's subclasses is actually wrong. It should be testing for FeatureGP64bit. There are no functional changes. llvm-svn: 224339	2014-12-16 15:29:12 +00:00
Zoran Jovanovic	2deca34803	[mips][microMIPS] Implement SWP and LWP instructions Differential Revision: http://reviews.llvm.org/D5667 llvm-svn: 224338	2014-12-16 14:59:10 +00:00
Aaron Ballman	0d6a010c13	Fixing -Wsign-compare warnings; NFC. llvm-svn: 224337	2014-12-16 14:04:11 +00:00
Elena Demikhovsky	f5b72afff4	Masked Load and Store Intrinsics in loop vectorizer. The loop vectorizer optimizes loops containing conditional memory accesses by generating masked load and store intrinsics. This decision is target dependent. http://reviews.llvm.org/D6527 llvm-svn: 224334	2014-12-16 11:50:42 +00:00
Bradley Smith	ececb7f6e2	[ARM] Prevent PerformVCVTCombine from combining a vmul/vcvt with 8 lanes This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 224332	2014-12-16 10:59:27 +00:00
Elena Demikhovsky	a79fc16bb0	X86: Added FeatureVectorUAMem for all AVX architectures. According to AVX specification: "Most arithmetic and data processing instructions encoded using the VEX prefix and performing memory accesses have more flexible memory alignment requirements than instructions that are encoded without the VEX prefix. Specifically, With the exception of explicitly aligned 16 or 32 byte SIMD load/store instructions, most VEX-encoded, arithmetic and data processing instructions operate in a flexible environment regarding memory address alignment, i.e. VEX-encoded instruction with 32-byte or 16-byte load semantics will support unaligned load operation by default. Memory arguments for most instructions with VEX prefix operate normally without causing #GP(0) on any byte-granularity alignment (unlike Legacy SSE instructions)." The same for AVX-512. This change does not affect anything right now, because only the "memop pattern fragment" depends on FeatureVectorUAMem and it is not used in AVX patterns. All AVX patterns are based on the "unaligned load" anyway. llvm-svn: 224330	2014-12-16 09:10:08 +00:00
Duncan P. N. Exon Smith	bb7d2fb1e5	IR: Stop printing 'metadata' in Metadata::print() Stop printing `metadata` in `Metadata::print()` and `Metadata::printAsOperand()`. llvm-svn: 224327	2014-12-16 07:40:31 +00:00
Duncan P. N. Exon Smith	fee167fcc0	IR: Make MDNode::dump() useful by adding addresses It's horrible to inspect `MDNode`s in a debugger. All of their operands that are `MDNode`s get dumped as `<badref>`, since we can't assign metadata slots in the context of a `Metadata::dump()`. (Why not? Why not assign numbers lazily? Because then each time you called `dump()`, a given `MDNode` could have a different lazily assigned number.) Fortunately, the C memory model gives us perfectly good identifiers for `MDNode`. Add pointer addresses to the dumps, transforming this: (lldb) e N->dump() !{i32 662302, i32 26, <badref>, null} (lldb) e ((MDNode*)N->getOperand(2))->dump() !{i32 4, !"foo"} into: (lldb) e N->dump() !{i32 662302, i32 26, <0x100706ee0>, null} (lldb) e ((MDNode*)0x100706ee0)->dump() !{i32 4, !"foo"} and this: (lldb) e N->dump() 0x101200248 = !{<badref>, <badref>, <badref>, <badref>, <badref>} (lldb) e N->getOperand(0) (const llvm::MDOperand) $0 = { MD = 0x00000001012004e0 } (lldb) e N->getOperand(1) (const llvm::MDOperand) $1 = { MD = 0x00000001012004e0 } (lldb) e N->getOperand(2) (const llvm::MDOperand) $2 = { MD = 0x0000000101200058 } (lldb) e N->getOperand(3) (const llvm::MDOperand) $3 = { MD = 0x00000001012004e0 } (lldb) e N->getOperand(4) (const llvm::MDOperand) $4 = { MD = 0x0000000101200058 } (lldb) e ((MDNode*)0x00000001012004e0)->dump() !{} (lldb) e ((MDNode*)0x0000000101200058)->dump() !{null} into: (lldb) e N->dump() !{<0x1012004e0>, <0x1012004e0>, <0x101200058>, <0x1012004e0>, <0x101200058>} (lldb) e ((MDNode*)0x1012004e0)->dump() !{} (lldb) e ((MDNode*)0x101200058)->dump() !{null} llvm-svn: 224325	2014-12-16 07:09:37 +00:00
Saleem Abdulrasool	417fc6b303	ARM: diagnose deprecated syntax The use of SP and PC in the register list for stores is deprecated on ARM (ARM ARM A.8.8.199): ARM deprecates the use of ARM instructions that include the SP or the PC in the list. Provide a deprecation warning from the assembler in the case that the syntax is ever seen. llvm-svn: 224319	2014-12-16 05:53:25 +00:00
Hal Finkel	8adf2254ef	[PowerPC] Improve instruction selection bit-permuting operations (32-bit) The PowerPC backend, somewhat embarrassingly, did not generate an optimal-length sequence of instructions for a 32-bit bswap. While adding a pattern for the bswap intrinsic to fix this would not have been terribly difficult, doing so would not have addressed the real problem: we had been generating poor code for many bit-permuting operations (by which I mean things like byte-swap that permute the bits of one or more inputs around in various ways). Here are some initial steps toward solving this deficiency. Bit-permuting operations are represented, at the SDAG level, using ISD::ROTL, SHL, SRL, AND and OR (mostly with constant second operands). Looking back through these operations, we can build up a description of the bits in the resulting value in terms of bits of one or more input values (and constant zeros). For each bit, we compute the rotation amount from the original value, and then group consecutive (value, rotation factor) bits into groups. Groups sharing these attributes are then collected and sorted, and we can then instruction select the entire permutation using a combination of masked rotations (rlwinm), imm ands (andi/andis), and masked rotation inserts (rlwimi). The result is that instead of lowering an i32 bswap as: rlwinm 5, 3, 24, 16, 23 rlwinm 4, 3, 24, 0, 7 rlwimi 4, 3, 8, 8, 15 rlwimi 5, 3, 8, 24, 31 rlwimi 4, 5, 0, 16, 31 we now produce: rlwinm 4, 3, 8, 0, 31 rlwimi 4, 3, 24, 16, 23 rlwimi 4, 3, 24, 0, 7 and for the 'test6' example in the PowerPC/README.txt file: unsigned test6(unsigned x) { return ((x & 0x00FF0000) >> 16) | ((x & 0x000000FF) << 16); } we used to produce: lis 4, 255 rlwinm 3, 3, 16, 0, 31 ori 4, 4, 255 and 3, 3, 4 and now we produce: rlwinm 4, 3, 16, 24, 31 rlwimi 4, 3, 16, 8, 15 and, as a nice bonus, this fixes the FIXME in test/CodeGen/PowerPC/rlwimi-and.ll. 
This commit does not include instruction-selection for i64 operations, those will come later. llvm-svn: 224318	2014-12-16 05:51:41 +00:00
Saleem Abdulrasool	08408ea86e	ARM: 80-column clang-format a function with an overly long string constant. NFC. llvm-svn: 224314	2014-12-16 04:10:10 +00:00
Matthias Braun	1aed6ffa35	LiveRangeCalc: Rewrite subrange calculation This changes subrange calculation to calculate subranges sequentially instead of in parallel. The code is easier to understand that way and addresses the code review issues raised about LiveOutData being hard to understand/needing more comments by removing them :) llvm-svn: 224313	2014-12-16 04:03:38 +00:00
Rafael Espindola	a23008ad4b	Remove the last unnecessary member variable of mapped_file_region. NFC. llvm-svn: 224312	2014-12-16 03:10:29 +00:00
Rafael Espindola	369d514616	Convert a member variable to a local variable. NFC. llvm-svn: 224311	2014-12-16 02:53:35 +00:00
Rafael Espindola	986f5adf8d	Remove unused member and simplify. NFC. llvm-svn: 224309	2014-12-16 02:19:26 +00:00
Rafael Espindola	9d1020648c	Start adding thin archive support. This is just sufficient for 'ar t' to work. llvm-svn: 224307	2014-12-16 01:43:41 +00:00
Adrian Prantl	b9fa945d51	ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224294	2014-12-16 00:20:49 +00:00
Colin LeMahieu	d9a00a9c38	[Hexagon] Adding doubleword multiplies with and without accumulation. llvm-svn: 224293	2014-12-16 00:07:24 +00:00
Michael Ilseman	8210281d13	Sink the isa into the assert llvm-svn: 224291	2014-12-15 23:41:21 +00:00
Colin LeMahieu	18c927620a	[Hexagon] Adding halfword to doubleword multiplies. llvm-svn: 224289	2014-12-15 23:29:37 +00:00
Colin LeMahieu	64ffd52943	[Hexagon] Adding logical-logical accumulation instructions and tests. llvm-svn: 224288	2014-12-15 23:19:07 +00:00
Sanjoy Das	4555b6d870	Teach ScalarEvolution to exploit min and max expressions when proving isKnownPredicate. The motivation for this change is to optimize away checks in loops like this: limit = min(t, len) for (i = 0 to limit) if (i >= len || i < 0) throw_array_out_of_bounds(); a[i] = ... Differential Revision: http://reviews.llvm.org/D6635 llvm-svn: 224285	2014-12-15 22:50:15 +00:00
JF Bastien	388b8794c9	x86: Emit LOCK prefix after DATA16 Summary: x86 allows either ordering for the LOCK and DATA16 prefixes, but using GCC+GAS leads to different code generation than using LLVM. This change matches the order that GAS emits the x86 prefixes when a semicolon isn't used in inline assembly (see tc-i386.c comment before define LOCK_PREFIX), and helps simplify tooling that operates on the instruction's byte sequence (such as NaCl's validator). This change shouldn't have any performance impact. Test Plan: ninja check Reviewers: craig.topper, jvoung Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D6630 llvm-svn: 224283	2014-12-15 22:34:58 +00:00
Colin LeMahieu	71e11a1d0d	[Hexagon] Adding a number of additional multiply forms with tests. llvm-svn: 224282	2014-12-15 22:10:37 +00:00
Michael Ilseman	00a6087f6b	Clean up warning about unused variable llvm-svn: 224281	2014-12-15 21:47:09 +00:00
Matthias Braun	c3a72c2e5f	Revert "LiveRangeCalc: Rewrite subrange calculation" Revert until I find out why non-subreg enabled targets break. This reverts commit 6097277eefb9c5fb35a7f493c783ee1fd1b9d6a7. llvm-svn: 224278	2014-12-15 21:36:35 +00:00
Michael Ilseman	1c38396db7	Revert of r223763, in spirit. r223763 was made to work around a temporary issue where a user of the JIT was passing down a declaration (incorrectly). This shouldn't occur, so assert rather than silently continue. llvm-svn: 224277	2014-12-15 21:36:29 +00:00
Mark Heffernan	acbed5e530	Clarify HowFarToZero computation when the step is a positive power of two. Functionally this should be identical to the existing code except for the case where Step is maximally negative (eg, INT_MIN). We now punt in that one corner case to make reasoning about the code easier. llvm-svn: 224274	2014-12-15 21:19:53 +00:00
Colin LeMahieu	4a46429305	[Hexagon] Adding misc multiply encodings and tests. llvm-svn: 224273	2014-12-15 21:17:03 +00:00
Matthias Braun	0352201e15	LiveRangeCalc: Rewrite subrange calculation This changes subrange calculation to calculate subranges sequentially instead of in parallel. The code is easier to understand that way and addresses the code review issues raised about LiveOutData being hard to understand/needing more comments by removing them :) llvm-svn: 224272	2014-12-15 21:16:21 +00:00
Colin LeMahieu	26f884aedf	[Hexagon] Adding doubleword accumulating multiplies of halfwords. llvm-svn: 224267	2014-12-15 20:17:46 +00:00
Colin LeMahieu	572c53e258	[Hexagon] Adding accumulating half word multiplies. llvm-svn: 224266	2014-12-15 20:10:28 +00:00
Colin LeMahieu	d1704cdc07	[Hexagon] Adding multiply with rnd/sat/rndsat llvm-svn: 224265	2014-12-15 20:01:59 +00:00
Matthias Braun	42fab34ffb	LiveRangeCalc: use more range based for loops; NFC llvm-svn: 224263	2014-12-15 19:40:46 +00:00
Colin LeMahieu	fe4012a969	[Hexagon] Adding encoding bits for halfword multiplies. llvm-svn: 224261	2014-12-15 19:22:07 +00:00
Ahmed Bougacha	c2a87ddf01	[X86] Also pretty-print shuffle mask for INSERTPS rm variants. llvm-svn: 224260	2014-12-15 19:17:54 +00:00
Duncan P. N. Exon Smith	be7ea19b58	IR: Make metadata typeless in assembly Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. llvm-svn: 224257	2014-12-15 19:07:53 +00:00
Michael Ilseman	addddc441f	Silence more static analyzer warnings. Add in definedness checks for shift operators, null checks when pointers are assumed by the code to be non-null, and explicit unreachables. llvm-svn: 224255	2014-12-15 18:48:43 +00:00
Vladimir Medic	d7ecf49e97	Add disassembler tests for mips3 platform. There are no functional changes. llvm-svn: 224253	2014-12-15 16:19:34 +00:00
Aaron Ballman	2d67fd6d64	Changing a cast from unsigned to uint64_t, should be NFC in practice. llvm-svn: 224249	2014-12-15 14:25:12 +00:00
Elena Demikhovsky	a5599bfd72	Sink store based on alias analysis - by Ella Bolshinsky The alias analysis is used to define whether the given instruction is a barrier for store sinking. For 2 identical stores, the following instructions are checked in both basic blocks, to determine whether they are sinking barriers. http://reviews.llvm.org/D6420 llvm-svn: 224247	2014-12-15 14:09:53 +00:00
Michael Kuperstein	47c97157ef	[X86] Break false dependencies before partial register updates when the source operand is in memory Adds the various "rm" instruction variants into the list of instructions that have a partial register update. Also adds all variants of SQRTSD that were missing in the original list. Differential Revision: http://reviews.llvm.org/D6620 llvm-svn: 224246	2014-12-15 13:18:21 +00:00
Elena Demikhovsky	72860c341e	AVX-512: Added EXPAND instructions and intrinsics. llvm-svn: 224241	2014-12-15 10:03:52 +00:00
Alexey Bataev	48be28ef1a	Fix line mapping information in LLVM JIT profiling with Vtune The line mapping information for dynamic code is reported incorrectly. It causes VTune to map LLVM generated code to source lines incorrectly. This patch fixes this issue. Patch by Denis Pravdin. Differential Revision: http://reviews.llvm.org/D6603 llvm-svn: 224229	2014-12-15 04:45:43 +00:00
David Majnemer	c175dd2ea5	ThreadLocal: Move Unix-specific code out of Support/ThreadLocal.cpp Just a cleanup, no functionality change is intended. llvm-svn: 224227	2014-12-15 01:19:53 +00:00
David Majnemer	421c89debc	ThreadLocal: Return a mutable pointer if templated with a non-const type It makes more sense for ThreadLocal<const T>::get to return a const T* and ThreadLocal<T>::get to return a T*. llvm-svn: 224225	2014-12-15 01:04:45 +00:00
Elena Demikhovsky	3fcafa2cdb	Loop Vectorizer minor changes in the code - some comments, function names, indentation. Reviewed here: http://reviews.llvm.org/D6527 llvm-svn: 224218	2014-12-14 09:43:50 +00:00
David Majnemer	7f03920dad	APInt: udivrem should use machine instructions for single-word APInts This mirrors the behavior of APInt::udiv and APInt::urem. Some architectures, like X86, have a single instruction which can compute both division and remainder. llvm-svn: 224217	2014-12-14 09:41:56 +00:00
David Majnemer	4e87936d2f	ScalarEvolution: Remove SCEVUDivision, it's unused This is just a code simplification, no functionality change is intended. llvm-svn: 224216	2014-12-14 09:12:33 +00:00
Hal Finkel	4104a1a346	[PowerPC] Handle cmp op promotion for SELECT[_CC] nodes in PPCTL::DAGCombineExtBoolTrunc PPCTargetLowering::DAGCombineExtBoolTrunc contains logic to remove unwanted truncations and extensions when dealing with nodes of the form: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) There was a FIXME in the implementation (now removed) regarding the fact that the function would abort the transformations if any of the non-output operands of a SELECT or SELECT_CC node would need to be promoted (because they were also output operands, for example). As a result, we continued to generate unnecessary zero-extends for code such as this: unsigned foo(unsigned a, unsigned b) { return (a <= b) ? a : b; } which would produce: cmplw 0, 3, 4 isel 3, 4, 3, 1 rldicl 3, 3, 0, 32 blr and now we produce: cmplw 0, 3, 4 isel 3, 4, 3, 1 blr which is better in the obvious way. llvm-svn: 224213	2014-12-14 05:53:19 +00:00
Ahmed Bougacha	0cb861634b	Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores." r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown." Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 224203	2014-12-13 23:22:12 +00:00
Renato Golin	df8f9b6dc9	Revert "[ARM] Combine base-updating/post-incrementing vector load/stores." This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe. llvm-svn: 224198	2014-12-13 20:23:18 +00:00
Aaron Ballman	786a86cb13	Silencing a -Wsign-compare warning; NFC. llvm-svn: 224195	2014-12-13 16:55:02 +00:00
Akira Hatanaka	7ba78302b5	Rename argument strings of codegen passes to avoid collisions with command line options. This commit changes the command line arguments (PassInfo::PassArgument) of two passes, MachineFunctionPrinter and MachineScheduler, to avoid collisions with command line options that have the same argument strings. This bug manifests when the PassList construct (defined in opt.cpp) is used in a tool that links with codegen passes. To reproduce the bug, paste the following lines into llc.cpp and run llc. #include "llvm/IR/LegacyPassNameParser.h" static llvm::cl::list<const llvm::PassInfo*, bool, llvm::PassNameParser> PassList(llvm::cl::desc("Optimizations available:")); rdar://problem/19212448 llvm-svn: 224186	2014-12-13 04:52:04 +00:00
Hal Finkel	4c6658feb0	[PowerPC] Add a DAGToDAG peephole to remove unnecessary zero-exts On PPC64, we end up with lots of i32 -> i64 zero extensions, not only from all of the usual places, but also from the ABI, which specifies that values passed are zero extended. Almost all 32-bit PPC instructions in PPC64 mode are defined to do something to the higher-order bits, and for some instructions, that action clears those bits (thus providing a zero-extended result). This is especially common after rotate-and-mask instructions. Adding an additional instruction to zero-extend the results of these instructions is unnecessary. This PPCISelDAGToDAG peephole optimization examines these zero-extensions, and looks back through their operands to see if all instructions will implicitly zero extend their results. If so, we convert these instructions to their 64-bit variants (which is an internal change only, the actual encoding of these instructions is the same as the original 32-bit ones) and remove the unnecessary zero-extension (changing where the INSERT_SUBREG instructions are to make everything internally consistent). llvm-svn: 224169	2014-12-12 23:59:36 +00:00
David Majnemer	9b6097586c	ValueTracking: Don't recurse too deeply in computeKnownBitsFromAssume Respect the MaxDepth recursion limit, doing otherwise will trigger an assert in computeKnownBits. This fixes PR21891. llvm-svn: 224168	2014-12-12 23:59:29 +00:00
Chad Rosier	620fb2206d	[ARMConstantIsland] Insert tbb/tbh optimization where previous jump table resided. llvm-svn: 224165	2014-12-12 23:27:40 +00:00
Yaron Keren	a604d6c84d	Pass EC by reference to MemoryBufferMMapFile to return error code. Patch by Kim Grasman! llvm-svn: 224159	2014-12-12 22:27:53 +00:00
Michael Ilseman	5be22a12c2	Clean up static analyzer warnings. Clang's static analyzer found several potential cases of undefined behavior, use of un-initialized values, and potentially null pointer dereferences in tablegen, Support, MC, and ADT. This cleans them up with specific assertions on the assumptions of the code. llvm-svn: 224154	2014-12-12 21:48:03 +00:00
Colin LeMahieu	90482a77b1	[Hexagon] Adding double word add/min/minu/max/maxu instructions and tests. llvm-svn: 224153	2014-12-12 21:29:25 +00:00
Colin LeMahieu	984ef17d66	[Hexagon] Adding J class call instructions. llvm-svn: 224150	2014-12-12 21:12:27 +00:00
Duncan P. N. Exon Smith	121eeff4f3	IR: Don't track nullptr on metadata RAUW The RAUW support in `Metadata` supports going to `nullptr` specifically to handle values being deleted, causing `ValueAsMetadata` to be deleted. Fix the case where the reference is from a `TrackingMDRef` (as opposed to an `MDOperand` or a `MetadataAsValue`). This is surprisingly rare -- metadata tracked by `TrackingMDRef` going to null -- but it came up in an openSUSE bootstrap during inlining. The tracking ref was held by the `ValueMap` because it was referencing a local, the basic block containing the local became dead after it had been merged in, and when the local was deleted, the tracking ref asserted in an `isa`. llvm-svn: 224146	2014-12-12 19:24:33 +00:00
Rafael Espindola	6de938b002	MAP_FILE is the default. We don't need to add it. llvm-svn: 224144	2014-12-12 19:12:42 +00:00
Steven Wu	f179d12e50	More code format fix from r224133, NFC llvm-svn: 224140	2014-12-12 18:48:37 +00:00
Rafael Espindola	275e342ca9	Remove silly left over from the Windows resize_file implementation. I didn't notice the problem first because on a non debug build the CRT was just exiting the process without any message. llvm-svn: 224139	2014-12-12 18:37:43 +00:00
Rafael Espindola	c69f13bfe1	Move the resize file feature from mapped_file_region to the only user. This removes a duplicated stat on every file that llvm-ar looks at. llvm-svn: 224138	2014-12-12 18:13:23 +00:00
Rafael Espindola	59aaa6c06b	Pass a FD to resize_file and add a testcase. I will add a real use in another commit. llvm-svn: 224136	2014-12-12 17:55:12 +00:00
Rafael Espindola	5753cf3c63	Remove unused feature. NFC. llvm-svn: 224135	2014-12-12 17:35:34 +00:00
Steven Wu	1f7402a14e	Restructure code from r224097. NFC llvm-svn: 224133	2014-12-12 17:21:54 +00:00
Robert Khasanov	37c3ad6c20	[AVX512] Enabling bit logic lowering Added lowering tests. llvm-svn: 224132	2014-12-12 17:02:18 +00:00
Vasileios Kalintiris	8edbcad8e5	[mips] Enable code generation for MIPS-III. Summary: This commit enables the MIPS-III target and adds support for code generation of SELECT nodes. We have to use pseudo-instructions with custom inserters for these nodes as MIPS-III CPUs do not have conditional-move instructions. Depends on D6212 Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6464 llvm-svn: 224128	2014-12-12 15:16:46 +00:00
Robert Khasanov	e82a3630b7	[AVX512] Enabling MIN/MAX lowering. Added lowering tests. llvm-svn: 224127	2014-12-12 15:10:43 +00:00
Andrea Di Biagio	d65fd9facd	Reapply "[MachineScheduler] Fix for PR21807: minor code difference building with/without -g." This reapplies r224118 with a fix for test 'misched-code-difference-with-debug.ll'. That test was failing on some buildbots because it was x86 specific but it was missing a target triple. Added an explicit triple to test misched-code-difference-with-debug.ll. llvm-svn: 224126	2014-12-12 15:09:58 +00:00
Chad Rosier	78943bcc18	[Reassociate] Use dbgs() instead of errs(). llvm-svn: 224125	2014-12-12 14:44:12 +00:00
Vasileios Kalintiris	f53f785a6e	[mips] Support SELECT nodes for targets that don't have conditional-move instructions. Summary: For Mips targets that do not have conditional-move instructions, ie. targets before MIPS32 and MIPS-IV, we have to insert a diamond control-flow pattern in order to support SELECT nodes. In order to do that, we add pseudo-instructions with a custom inserter that emits the necessary control-flow that selects the correct value. With this patch we add complete support for code generation of Mips-II targets based on the LLVM test-suite. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6212 llvm-svn: 224124	2014-12-12 14:41:37 +00:00
Robert Khasanov	4204c1acc6	[AVX512] Minor fix in lowering pattern for broadcast instructions. No functional change. llvm-svn: 224122	2014-12-12 14:21:30 +00:00
Andrea Di Biagio	5634a54efc	Revert: [MachineScheduler] Fix for PR21807: minor code difference building with/without -g. Test 'misched-code-difference-with-debug.ll' was failing on some buildbots. llvm-svn: 224121	2014-12-12 13:34:03 +00:00
Suyog Sarda	384095e65c	This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it. Test case : float hadd(float* a) { return (a[0] + a[1]) + (a[2] + a[3]); } AArch64 assembly before patch : ldp s0, s1, [x0] ldp s2, s3, [x0, #8] fadd s0, s0, s1 fadd s1, s2, s3 fadd s0, s0, s1 ret AArch64 assembly after patch : ldp d0, d1, [x0] fadd v0.2s, v0.2s, v1.2s faddp s0, v0.2s ret Reviewed Link : http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248531.html llvm-svn: 224119	2014-12-12 12:53:44 +00:00
Andrea Di Biagio	01236e3eca	[MachineScheduler] Fix for PR21807: minor code difference building with/without -g. This patch fixes the issue reported as PR21807. There was a minor difference in the generated code depending on the -g flag. The cause was that with -g the machine scheduler used a different scheduling strategy. This decision was based on the number of instructions in a schedule region and included debug instructions in that count. This patch fixes the issue in MISched and provides a test. Patch by Russell Gallop! llvm-svn: 224118	2014-12-12 12:41:22 +00:00
Charlie Turner	1a53996c31	Emit Tag_ABI_FP_16bit_format build attribute. The __fp16 type is unconditionally exposed. Since -mfp16-format is not yet supported, there is not a user switch to change this behaviour. This build attribute should capture the default behaviour of the compiler, which is to expose the IEEE 754 version of __fp16. When -mfp16-format is emitted, that will be the way to control the value of this build attribute. Change-Id: I8a46641ff0fd2ef8ad0af5f482a6d1af2ac3f6b0 llvm-svn: 224115	2014-12-12 11:59:18 +00:00
Ekaterina Romanova	90ff20d8f5	A fix for PR21176. DW_OP_const <const> doesn't describe a constant value, but a value at a constant address. The proper way to describe a constant value is DW_OP_constu <const>, DW_OP_stack_value. Added DW_OP_stack_value to the stack. Marked incorrect-variable-debugloc1.ll to xfail for PowerPC64, while the failure (PR21881) is being investigated. llvm-svn: 224098	2014-12-12 05:11:47 +00:00
Steven Wu	881916dea5	Fix another infinite loop in InstCombine Summary: InstCombine infinite-loops for the testcase added It is because InstCombine is generating instructions that can be optimized by itself. Fix by not optimizing frem if the optimized type is the same as original type. rdar://problem/19150820 Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6634 llvm-svn: 224097	2014-12-12 04:34:07 +00:00
Matt Arsenault	1e3a4ebc6e	R600: Fix min/max matching problems with unordered compares The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094	2014-12-12 02:30:37 +00:00
Matt Arsenault	145d5717f5	R600/SI: fmin/fmax_legacy are not associative llvm-svn: 224093	2014-12-12 02:30:33 +00:00
Matt Arsenault	477b178276	R600/SI: Don't promote f32 select to i32 This is nice for the instruction patterns, but it complicates min / max matching. The select doesn't have the correct type and would require looking through the bitcasts for the real float operands. llvm-svn: 224092	2014-12-12 02:30:29 +00:00
Duncan P. N. Exon Smith	5bd34e56ce	Bitcode: Add missing "Remove in 4.0" comments llvm-svn: 224090	2014-12-12 02:11:31 +00:00
Matthias Braun	0880c6098f	Document that PassManager::add() may delete the pass right away. Also remove redundant documentation: - doxygen will copy documentation to overridden methods. - Use \copydoc on PIMPL classes instead of replicating the text. llvm-svn: 224089	2014-12-12 01:27:01 +00:00
Philip Reames	60de8b29f7	Comment and minor code cleanup for GCStrategy (NFC) Updating comments to reflect the current state of the world after my recent changes to ownership structure and generally better describe what a GCStrategy is and how it works. llvm-svn: 224086	2014-12-12 00:49:03 +00:00
Matt Arsenault	810cb62962	Add target hook for whether it is profitable to reduce load widths Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084	2014-12-12 00:00:24 +00:00
Sanjay Patel	757942a38f	remove function names from comments; NFC llvm-svn: 224080	2014-12-11 23:38:43 +00:00
Matt Arsenault	102a70409e	R600/SI: Handle physical registers in getOpRegClass llvm-svn: 224079	2014-12-11 23:37:34 +00:00
Matt Arsenault	e368cb378f	R600/SI: Don't verify constant bus usage of flag ops This was checking if pseudo-operands like the source modifiers were using the constant bus, which happens to work because the values these all can be happen to be valid inline immediates. This fixes a later commit which starts checking the register class of the operands. llvm-svn: 224078	2014-12-11 23:37:32 +00:00
Duncan P. N. Exon Smith	eca1e031d1	Bitcode: Use unsigned char to record MDStrings `MDString`s can have arbitrary characters in them. Prevent an assertion that fired in `BitcodeWriter` because of sign extension by copying the characters into the record as `unsigned char`s. Based on a patch by Keno Fischer; fixes PR21882. llvm-svn: 224077	2014-12-11 23:34:30 +00:00
Sanjay Patel	c694ac5519	return without temporary; NFC llvm-svn: 224076	2014-12-11 23:30:36 +00:00
Matthias Braun	b2f2388a76	Enable MachineVerifier in debug mode for X86, ARM, AArch64, Mips. llvm-svn: 224075	2014-12-11 23:18:03 +00:00
Ahmed Bougacha	79c797443b	[X86] Add a temporary testcase for PR21876/r223996. llvm-svn: 224074	2014-12-11 23:07:52 +00:00
Duncan P. N. Exon Smith	5c7006e062	Bitcode: Add METADATA_NODE and METADATA_VALUE This reflects the typelessness of `Metadata` in the bitcode format, removing types from all metadata operands. `METADATA_VALUE` represents a `ValueAsMetadata`, and always has two fields: the type and the value. `METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`, doesn't store types. It stores operands at their ID+1 so that `0` can reference `nullptr` operands. Part of PR21532. llvm-svn: 224073	2014-12-11 23:02:24 +00:00
Hal Finkel	b5e9b0426a	[PowerPC] Better lowering for add/or of a FrameIndex If we have an add (or an or that is really an add), where one operand is a FrameIndex and the other operand is a small constant, we can combine the lowering of the FrameIndex (which is lowered as an add of the FI and a zero offset) with the constant operand. Amusingly, this is an old potential improvement entry from lib/Target/PowerPC/README.txt which had never been resolved. In short, we used to lower: %X = alloca { i32, i32 } %Y = getelementptr {i32,i32}* %X, i32 0, i32 1 ret i32* %Y as: addi 3, 1, -8 ori 3, 3, 4 blr and now we produce: addi 3, 1, -4 blr which is much more sensible. llvm-svn: 224071	2014-12-11 22:51:06 +00:00
Duncan P. N. Exon Smith	005f9f433c	Bitcode: Add `OLD_` prefix to metadata node records I'm about to change these, so move the old ones out of the way. Part of PR21532. llvm-svn: 224070	2014-12-11 22:30:48 +00:00
Matt Arsenault	58d502f0d4	R600/SI: Use unordered equal instructions llvm-svn: 224067	2014-12-11 22:15:43 +00:00
Matt Arsenault	8b989efaf9	R600/SI: Make more unordered comparisons legal This saves a second compare and an and / or by using the unordered comparison instructions. llvm-svn: 224066	2014-12-11 22:15:39 +00:00
Matt Arsenault	9cded7a74b	R600/SI: Use unordered not equal instructions llvm-svn: 224065	2014-12-11 22:15:35 +00:00
Alexey Samsonov	4b7f413e3e	[ASan] Change fake stack and local variables handling. This commit changes the way we get fake stack from ASan runtime (to find use-after-return errors) and the way we represent local variables: - __asan_stack_malloc function now returns pointer to newly allocated fake stack frame, or NULL if frame cannot be allocated. It doesn't take pointer to real stack as an input argument, it is calculated inside the runtime. - __asan_stack_free function doesn't take pointer to real stack as an input argument. Now this function is never called if fake stack frame wasn't allocated. - __asan_init version is bumped to reflect changes in the ABI. - new flag "-asan-stack-dynamic-alloca" allows to store all the function local variables in a dynamic alloca, instead of the static one. It reduces the stack space usage in use-after-return mode (dynamic alloca will not be called if the local variables are stored in a fake stack), and improves the debug info quality for local variables (they will not be described relatively to %rbp/%rsp, which are assumed to be clobbered by function calls). This flag is turned off by default for now, but I plan to turn it on after more testing. llvm-svn: 224062	2014-12-11 21:53:03 +00:00
Duncan P. N. Exon Smith	d6f8e4b03c	CodeGen: Stop using LeakDetector for MachineInstr Since `MachineInstr` is required to have a trivial destructor, it cannot remove itself from `LeakDetection`. Remove the calls. As it happens, this requirement is because `MachineFunction` allocates all `MachineInstr`s in a custom allocator; when the `MachineFunction` is destroyed they're dropped off the edge. There's no benefit to detecting leaks. llvm-svn: 224061	2014-12-11 21:51:37 +00:00
Duncan P. N. Exon Smith	63eb6bf623	IR: Store MDNodes in a separate LeakDetector container This gives us better leak detection messages, like `Value` has. This also has the side effect of papering over a problem where `MachineInstr`s are added as garbage to the leak detector and then deleted without being removed. If `MDNode::getTemporary()` allocates an `MDNodeFwdDecl` in the same spot, the leak detector asserts. By separating `MDNode`s into their own container we lose that assertion. Since `MachineInstr` is required to have a trivial destructor, its usage of `LeakDetector` at all is pretty suspect. I'll be sending a patch soon to strip that out. llvm-svn: 224060	2014-12-11 21:39:39 +00:00
Matthias Braun	7e37a5f523	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option in all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass. llvm-svn: 224059	2014-12-11 21:26:47 +00:00
David Majnemer	0a14c0ec9d	AsmParser: Don't crash on an ill-formed MDNodeVector llvm-svn: 224056	2014-12-11 20:51:54 +00:00
Andrea Di Biagio	72b05aa59c	[InstCombine][X86] Improved folding of calls to Intrinsic::x86_sse4a_insertqi. This patch teaches the instruction combiner how to fold a call to 'insertqi' if the 'length field' (3rd operand) is set to zero, and if the sum between field 'length' and 'bit index' (4th operand) is bigger than 64. From the AMD64 Architecture Programmer's Manual: 1. If the sum of the bit index + length field is greater than 64, then the results are undefined; 2. A value of zero in the field length is defined as a length of 64. This patch improves the existing combining logic for intrinsic 'insertqi' adding extra checks to address both point 1. and point 2. Differential Revision: http://reviews.llvm.org/D6583 llvm-svn: 224054	2014-12-11 20:44:59 +00:00
David Majnemer	06f960d5d3	AsmParser: Don't crash on an ill-formed MDNodeVector llvm-svn: 224053	2014-12-11 20:44:09 +00:00
Rafael Espindola	7eb1f1856c	Remove a convoluted way of calling close by moving the call to the only caller. As a bonus we can actually check the return value. llvm-svn: 224046	2014-12-11 20:12:55 +00:00
Rafael Espindola	01c73610d0	This reverts commit r224043 and r224042. check-llvm was failing. llvm-svn: 224045	2014-12-11 20:03:57 +00:00
Michael Ilseman	4e654cd664	Silence static analyzer warnings in LLVMSupport. The static analyzer catches a few potential bugs in LLVMSupport. Add in asserts to silence the warnings. llvm-svn: 224044	2014-12-11 19:46:38 +00:00
Matthias Braun	199aeff7dd	Enable machineverifier in debug mode for X86, ARM, AArch64, Mips llvm-svn: 224043	2014-12-11 19:42:09 +00:00
Matthias Braun	a7c82a9f1d	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option in all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. llvm-svn: 224042	2014-12-11 19:42:05 +00:00
Matthias Braun	a4e932db16	[CodeGen] Let MachineVerifierPass own its banner string llvm-svn: 224041	2014-12-11 19:41:51 +00:00
Colin LeMahieu	150b6b3a73	[Hexagon] Renaming classes in preparation for replacement. llvm-svn: 224036	2014-12-11 19:01:28 +00:00
Tim Northover	e2c33715bc	ARM: convert isTargetIOS checks to isTargetDarwin. The distinction is mostly useful in the front-end. By the time we get here, there are very few situations where we actually want different behaviour for Darwin and IOS (in fact Darwin mostly just exists in a few tests). So this should reduce any surprising weirdness for anyone using it. No functional change on anything anyone actually cares about. llvm-svn: 224035	2014-12-11 18:49:37 +00:00
Hal Finkel	13d104bf78	[PowerPC] Implement BuildSDIVPow2, lower i64 pow2 sdiv using sradi PPCISelDAGToDAG contained existing code to lower i32 sdiv by a power-of-2 using srawi/addze, but did not implement the i64 case. DAGCombine now contains a callback specifically designed for this purpose (BuildSDIVPow2), and part of the logic has been moved to an implementation of that callback. Doing this lowering using BuildSDIVPow2 likely does not matter, compared to handling everything in PPCISelDAGToDAG, for the positive divisor case, but the negative divisor case, which generates an additional negation, can potentially benefit from additional folding from DAGCombine. Now, both the i32 and the i64 cases have been implemented. Fixes PR20732. llvm-svn: 224033	2014-12-11 18:37:52 +00:00
Rafael Espindola	71bc507c4f	Remove dead code. NFC. llvm-svn: 224029	2014-12-11 17:17:26 +00:00
Cameron McInally	5fb084e798	[AVX512] Add support for 512b variable bit shift intrinsics. llvm-svn: 224028	2014-12-11 17:13:05 +00:00
Colin LeMahieu	adab80720d	[Hexagon] Adding i64 <- i32, i32 sextw pattern. llvm-svn: 224027	2014-12-11 17:08:21 +00:00
Colin LeMahieu	eb52f69f59	[Hexagon] Adding encoding information for sign extend word instruction. llvm-svn: 224026	2014-12-11 16:43:06 +00:00
Elena Demikhovsky	908dbf48c8	AVX-512: Added all forms of COMPRESS instruction + intrinsics + tests llvm-svn: 224019	2014-12-11 15:02:24 +00:00
Jozef Kolek	a330a47427	[mips][microMIPS] Implement CodeGen support for LI16 instruction. Differential Revision: http://reviews.llvm.org/D5840 llvm-svn: 224017	2014-12-11 13:56:23 +00:00
Michael Kuperstein	fffb6996c9	The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value. Patch by Amjad Aboud Differential Revision: http://reviews.llvm.org/D6525 llvm-svn: 224015	2014-12-11 12:41:10 +00:00
Michael Kuperstein	11165674dc	[X86] When converting movs to pushes, don't assume MOVmi operand is an actual immediate This should fix PR21878. llvm-svn: 224010	2014-12-11 11:26:16 +00:00
Patrik Hagglund	cb06a36c9a	Bugfix in InlineSpiller::traceSiblingValue(). Properly determine whether or not a phi was added by splitting. Check against the current VNInfo of OrigLI instead of against the OrigVNI argument. Patch provided by Jonas Paulsson. Reviewed by Quentin Colombet. llvm-svn: 224009	2014-12-11 10:40:17 +00:00
Elena Demikhovsky	fc081457f1	AVX-512: Fixed a bug in lowering setcc for MVT::i1 type llvm-svn: 224008	2014-12-11 10:21:12 +00:00
Kumar Sukhani	fb60e77fcc	test commit (spelling correction) llvm-svn: 224007	2014-12-11 08:33:36 +00:00
Erik Eckstein	096ff7dcd6	Refactor creation of overflow result tuples in InstCombineCalls. Extract the creation of overflow result tuples in a separate function. NFC. llvm-svn: 224006	2014-12-11 08:02:30 +00:00
Craig Topper	5fcc5abc1f	Use range-based for loops. NFC llvm-svn: 224005	2014-12-11 07:04:54 +00:00
Ekaterina Romanova	75fd123967	Reverting commit 223981, because the test that I added (incorrect-variable-debugloc1.ll) failed for llvm-ppc64. The test is failing for llvm-ppc64 because for this platform the location list is not being generated at all (most likely because of a bug in PPC code optimization or generation). I will file a bug against the PPC compiler, but meanwhile, until the PPC bug is fixed, I will have to revert my change. llvm-svn: 224000	2014-12-11 06:22:35 +00:00
Craig Topper	c3504c4874	Make MultiClass::DefPrototypes own their Records to fix memory leaks. llvm-svn: 223998	2014-12-11 05:25:33 +00:00
Craig Topper	7adf2bf76a	Replace std::map<K, V*> with std::map<K, std::unique_ptr<V>> to handle ownership and deletion of the values. Ideally we would store the MultiClasses by value directly in the maps, but I had some trouble with that before and this at least fixes the leak. llvm-svn: 223997	2014-12-11 05:25:30 +00:00
Ahmed Bougacha	611a3ef0bc	[X86] Add back AVX2 VR256 PMOVX patterns. We can't reach those from zext, but other parts of the backend (the shuffle lowering) generate 256-bit VZEXT nodes. Fixes PR21876. llvm-svn: 223996	2014-12-11 04:32:17 +00:00
Nick Lewycky	dde76ff09c	Fix LLVMContext to match what MDKind names that the LL parser permits. Fixes PR21799! llvm-svn: 223995	2014-12-11 02:10:28 +00:00
Philip Reames	1e30897497	GCStrategy should not own GCFunctionInfo This change moves the ownership and access of GCFunctionInfo (the object which describes the safepoints associated with a safepoint under GCRoot) to GCModuleInfo. Previously, this was owned by GCStrategy which was in turn owned by GCModuleInfo. This made GCStrategy module specific which is 'surprising' given its name and other purposes. There are a few more changes needed, but we're getting towards the point we can reuse GCStrategy for gc.statepoint as well. p.s. The style of this code ends up being a mess. I was trying to move code around without otherwise changing much. Once I get the ownership structure rearranged, I will go through and fixup spacing, naming, comments etc. Differential Revision: http://reviews.llvm.org/D6587 llvm-svn: 223994	2014-12-11 01:47:23 +00:00
Matthias Braun	09afa1ea74	LiveInterval: Use range based for loops for subregister ranges. llvm-svn: 223991	2014-12-11 00:59:06 +00:00
Tim Northover	2ac7e4b3ee	ARM: correctly expand LDR-lit based globals. Quite a major error here: the expansions for the Pseudos with and without folded load were mixed up. Fortunately it only affects ARM-mode, when not using movw/movt, on Darwin. I'm guessing no-one actually uses that combination. llvm-svn: 223986	2014-12-10 23:40:50 +00:00
Ekaterina Romanova	ceeaba7932	A fix for PR21176. DW_OP_const <const> doesn't describe a constant value, but a value at a constant address. The proper way to describe a constant value is DW_OP_constu <const>, DW_OP_stack_value. Added DW_OP_stack_value to the stack. -This line, and those below, will be ignored-- M lib/CodeGen/AsmPrinter/DwarfDebug.cpp A test/DebugInfo/incorrect-variable-debugloc1.ll llvm-svn: 223981	2014-12-10 23:19:56 +00:00
Matthias Braun	96761959d4	LiveInterval: Use more range based for loops for value numbers and segments. llvm-svn: 223978	2014-12-10 23:07:54 +00:00
Mark Heffernan	41d7656d5a	Fix PR21694. r219517 added a use of SCEV divide in HowFarToZero computation. This divide can produce incorrect results as we are using an unsigned divide for what should be a modular divide. This change reverts back to a more conservative computation using trailing zeros. llvm-svn: 223974	2014-12-10 22:53:52 +00:00
Colin LeMahieu	220adb6370	[Hexagon] Adding combine ri/ir instructions. llvm-svn: 223971	2014-12-10 22:23:07 +00:00
David Majnemer	3729668056	ConstantFold: Clean up X * undef code No functional change intended. llvm-svn: 223970	2014-12-10 21:58:17 +00:00
David Majnemer	5a7717e498	ConstantFold, InstSimplify: undef >>a x can be either -1 or 0, choose 0 Zero is usually a nicer constant to have than -1. llvm-svn: 223969	2014-12-10 21:58:15 +00:00
David Majnemer	89cf6d79eb	ConstantFold: an undef shift amount results in undef X shifted by undef results in undef because the undef value can represent values greater than the width of the operands. llvm-svn: 223968	2014-12-10 21:38:05 +00:00
Colin LeMahieu	db0b13cef0	[Hexagon] Adding encodings for JR class instructions. Updating compiler usages. llvm-svn: 223967	2014-12-10 21:24:10 +00:00
Rafael Espindola	0e77a94fd6	Move three methods only used by MCJIT to MCJIT. These methods are only used by MCJIT and are very specific to it. In fact, they are also fairly specific to the fact that we have a dynamic linker of relocatable objects. llvm-svn: 223964	2014-12-10 20:46:55 +00:00
Juergen Ributzka	2326650ceb	[AArch64] MachO large code-model: Materialize FP constants in code. In the large code model we have to first get the address of the GOT entry, load the address of the constant, and then load the constant itself. To avoid these loads and the GOT entry altogether this commit changes the way FP constants are materialized in the large code model. The constants are now materialized in a GPR and then bitconverted/moved into the FPR. Reviewed by Tim Northover Fixes rdar://problem/16572564. llvm-svn: 223941	2014-12-10 19:43:32 +00:00
Marek Olsak	0c05645b0f	R600/SI: Use getTargetConstant in AdjustRegClass llvm-svn: 223940	2014-12-10 19:25:31 +00:00
Colin LeMahieu	8872d20788	[Hexagon] Adding JR class predicated call reg instructions. llvm-svn: 223933	2014-12-10 18:24:16 +00:00
Sanjay Patel	e20437f9af	Match new shuffle codegen for MOVHPD patterns Add patterns to match SSE (shufpd) and AVX (vpermilpd) shuffle codegen when storing the high element of a v2f64. The existing patterns were only checking for an unpckh type of shuffle. http://llvm.org/bugs/show_bug.cgi?id=21791 Differential Revision: http://reviews.llvm.org/D6586 llvm-svn: 223929	2014-12-10 16:58:54 +00:00
Aaron Ballman	e5a2a0c9a8	Silencing a -Wsequence-point warning, and the resulting undefined behavior. NFC. llvm-svn: 223926	2014-12-10 14:14:54 +00:00
David Majnemer	7b86b77248	ConstantFold: div undef, 0 should fold to undef, not zero Dividing by zero yields an undefined value. llvm-svn: 223924	2014-12-10 09:14:55 +00:00
David Majnemer	ae707582c0	InstSimplify: [al]shr exact undef, %X -> undef Exact shifts always keep the non-zero bits of their input. This means it keeps its undef bits. llvm-svn: 223923	2014-12-10 09:14:52 +00:00
Michael Kuperstein	0104ff6529	[X86] Make a code path in EltsFromConsecutiveLoads work only on vectors it expects EltsFromConsecutiveLoads was apparently only ever called for 128-bit vectors, and assumed this implicitly. r223518 started calling it for AVX-sized vectors, causing the code path that had this assumption to crash. This adds a check to make this path fire only for 128-bit vectors. Differential Revision: http://reviews.llvm.org/D6579 llvm-svn: 223922	2014-12-10 08:46:12 +00:00
David Majnemer	71dc8fb867	InstSimplify: div %X, 0 -> undef We already optimized rem %X, 0 to undef, we should do the same for div. llvm-svn: 223919	2014-12-10 07:52:18 +00:00
David Majnemer	612f31284e	DataLayout: Provide nicer diagnostics for malformed strings llvm-svn: 223911	2014-12-10 02:36:41 +00:00
David Majnemer	6f3be2e155	AsmParser: Don't allow null bytes in BB labels Since Value objects can't have null bytes in their name, we shouldn't allow them in the labels of basic blocks. llvm-svn: 223907	2014-12-10 02:10:35 +00:00
Duncan P. N. Exon Smith	f14b1df55b	IR: Move call to dropAllReferences() to MDNode subclasses Don't call `dropAllReferences()` from `MDNode::~MDNode()`, call it directly from `~MDNodeFwdDecl()` and `~GenericMDNode()`. llvm-svn: 223904	2014-12-10 01:45:04 +00:00
David Majnemer	2dc1b0f514	DataLayout: Be more verbose when diagnosing problems in pointer specs llvm-svn: 223903	2014-12-10 01:38:28 +00:00
David Majnemer	5330c69bd1	DataLayout: Move asserts over to report_fatal_error As indicated by the tests, it is possible to feed the AsmParser an invalid datalayout string. We should verify the result of parsing this string regardless of whether or not we have assertions enabled. llvm-svn: 223898	2014-12-10 01:17:08 +00:00
Matthias Braun	96d7732b08	MachineVerifier: Allow physreg use if just a subreg is defined. We can't mark partially undefined registers, so we have to allow reading a register in the machine verifier if just parts of a register are defined. llvm-svn: 223896	2014-12-10 01:13:13 +00:00
Matthias Braun	21554d9b30	MachineVerifier: Allow LiveInterval segments to end at a partial write. In the subregister liveness tracking case we do not create implicit reads on partial register writes anymore, still we need to produce a new SSA value for partial writes so the live segment has to end. llvm-svn: 223895	2014-12-10 01:13:11 +00:00
Matthias Braun	279f83645c	VirtRegMap: Improve block live-in info if subregister liveness is available. llvm-svn: 223894	2014-12-10 01:13:08 +00:00
Matthias Braun	d70caaf5a5	VirtRegMap: No implicit defs/uses for super registers with subreg liveness tracking. Adding the implicit defs/uses to the superregisters is semantically questionable but was not dangerous before as the register allocator never assigned the same register to two overlapping LiveIntervals even when the actually live subregisters do not overlap. With subregister liveness tracking enabled this does actually happen and leads to subsequent bugs if we don't stop adding the superregister defs/uses. llvm-svn: 223892	2014-12-10 01:13:04 +00:00
Matthias Braun	587e27415d	LiveRegMatrix: Respect subregister liveness when allocating registers. llvm-svn: 223891	2014-12-10 01:13:01 +00:00
Matthias Braun	a0f0c1f013	LiveIntervalUnion: Allow specification of liverange when unifying/extracting. This allows it to add subregister ranges into the union. llvm-svn: 223890	2014-12-10 01:12:59 +00:00
Matthias Braun	14f764c872	RegisterCoalescer: Preserve subregister liveranges. llvm-svn: 223888	2014-12-10 01:12:52 +00:00
Matthias Braun	2079aa9140	LiveInterval: Add removeEmptySubRanges(). llvm-svn: 223887	2014-12-10 01:12:40 +00:00
Matthias Braun	8970d847c4	LiveIntervalAnalysis: Add subregister aware variants of pruneValue(). llvm-svn: 223886	2014-12-10 01:12:36 +00:00
Matthias Braun	e3d3b88cb9	Add a flag to enable/disable subregister liveness. llvm-svn: 223884	2014-12-10 01:12:30 +00:00
Matthias Braun	e5f861b781	LiveIntervalAnalysis: Adapt repairIntervalsInRange() to subregister liveness. llvm-svn: 223883	2014-12-10 01:12:26 +00:00
Matthias Braun	fe896c703c	LiveRangeEdit: Adapt eliminateDeadDef() to subregister liveness. llvm-svn: 223882	2014-12-10 01:12:23 +00:00
Matthias Braun	7044d69e87	LiveIntervalAnalysis: Adapt handleMove() to subregister ranges. llvm-svn: 223881	2014-12-10 01:12:20 +00:00
Matthias Braun	20e1f38a41	LiveIntervalAnalysis: Update SubRanges in shrinkToUses(). llvm-svn: 223880	2014-12-10 01:12:18 +00:00
Matthias Braun	2f66232bde	LiveIntervalAnalysis: Compute subregister ranges. llvm-svn: 223878	2014-12-10 01:12:12 +00:00
Matthias Braun	3f1d8fdd33	LiveInterval: Add support to track liveness of subregisters. This code adds the required data structures. Algorithms to compute it follow. llvm-svn: 223877	2014-12-10 01:12:10 +00:00
Matthias Braun	e62c207092	LiveInterval: Add a 'covers' operation to LiveRange. llvm-svn: 223876	2014-12-10 01:12:06 +00:00
David Majnemer	1d681aa0ba	AsmParser: Don't crash if a null byte is inside a quoted string We don't allow Value* to have names which contain null bytes. The AsmParser should reject .ll files that try to do this. llvm-svn: 223869	2014-12-10 00:43:17 +00:00
Ahmed Bougacha	7efbac74ec	[ARM] Combine base-updating/post-incrementing vector load/stores. We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 223862	2014-12-10 00:07:37 +00:00
Philip Reames	de226055ca	Remove the Module pointer from GCStrategy and GCMetadataPrinter In the current implementation, GCStrategy is a part of the ownership structure for the gc metadata which describes a Module. It also contains a reference to the module in question. As a result, GCStrategy instances are essentially Module specific. I plan to transition away from this design. Instead, a GCStrategy will be owned by the LLVMContext. It will be a lightweight policy object which contains no information about the Modules or Functions involved, but can be easily reached given a Function. The first step in this transition is to remove the direct Module reference from GCStrategy. This also requires removing the single user of this reference, the GCMetadataPrinter hierarchy. In theory, this will allow the lifetime of the printers to be scoped to the LLVMContext as well, but in practice, I'm not actually changing that. (Yet?) An alternate design would have been to move the direct Module reference into the GCMetadataPrinter and change the keying of the owning maps to explicitly key off both GCStrategy and Module. I'm open to doing it that way instead, but didn't see much value in preserving the per Module association for GCMetadataPrinters. The next change in this sequence will be to start unwinding the intertwined ownership between GCStrategy, GCModuleInfo, and GCFunctionInfo. Differential Revision: http://reviews.llvm.org/D6566 llvm-svn: 223859	2014-12-09 23:57:54 +00:00
Duncan P. N. Exon Smith	22600ff328	IR: Fix memory corruption in MDNode new/delete There were two major problems with `MDNode` memory management. 1. `MDNode::operator new()` called a placement array constructor for `MDOperand`. What? Each operand needs to be placed individually. 2. `MDNode::operator delete()` failed to destruct the `MDOperand`s at all. Frankly it's hard to understand how this worked locally, how this survived an LTO bootstrap, or how it worked on most of the bots. llvm-svn: 223858	2014-12-09 23:56:39 +00:00
David Majnemer	aa5d70764f	AsmParser: Verify that the contents of a hex integer are hex llvm-svn: 223856	2014-12-09 23:50:38 +00:00
Kaelyn Takata	22324f378a	Rename static function "map" to be more descriptive and to avoid potential confusion with the std::map type. llvm-svn: 223853	2014-12-09 23:32:46 +00:00
Duncan P. N. Exon Smith	d167eac023	IR: Metadata: Detect an RAUW recursion Speculatively handle a recursion in `GenericMDNode::handleChangedOperand()`. I'm hoping this fixes the failing hexagon bot [1]. [1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/13434 llvm-svn: 223849	2014-12-09 23:04:59 +00:00
Michael Zolotukhin	4def395646	Remove redundant variable. Tested by adding assert(LoopVectorPreHeader == VecPreheader) on LLVM test suite and SPECs. llvm-svn: 223847	2014-12-09 22:45:07 +00:00
Colin LeMahieu	b32bf14c2a	[Hexagon] [NFC] Cleaning up unused classes. llvm-svn: 223845	2014-12-09 22:33:26 +00:00
Ahmed Bougacha	b31fba1613	[ARM] Factor out base-updating VLD/VST combiner function. NFC. Move the combiner-state check into another function, add a few small comments, and use a more general type in a cast<>. In preparation for a future patch. llvm-svn: 223834	2014-12-09 21:30:00 +00:00
Ahmed Bougacha	2316746e40	[ARM] Move the store combiner function down. NFC. And flip its final condition. In preparation for a future patch. llvm-svn: 223833	2014-12-09 21:26:53 +00:00
Ahmed Bougacha	be0b227679	[ARM] Also support v2f64 vld1/vst1. It was missing from the VLD1/VST1 handling logic, even though the corresponding instructions exist (same form as v2i64). In preparation for a future patch. llvm-svn: 223832	2014-12-09 21:25:00 +00:00
Duncan P. N. Exon Smith	21909e35cb	IR: Metadata/Value split: RAUW in a deterministic order RAUW in a deterministic order to try to recover the hexagon bot [1], whose tests started failing once my GCC fixes were in for r223802. Otherwise, I'm not sure why tests would fail there and not here. [1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/13426 llvm-svn: 223829	2014-12-09 21:12:56 +00:00
Rafael Espindola	0bfe828f7a	Return ErrorOr<std::unique_ptr<Archive>> from getAsArchive. This is the same return type of Archive::create. llvm-svn: 223827	2014-12-09 21:05:36 +00:00
Hans Wennborg	e242e8b064	Try fixing MSVC build after r223802 LLVM_EXPLICIT is only supported by recent versions of MSVC, and it seems the not-so-recent versions get confused about the operator bool() when trying to resolve operator== calls. This removed the operator bool()'s since they don't seem to be used anyway. llvm-svn: 223824	2014-12-09 20:39:15 +00:00
Colin LeMahieu	b030c254c0	[Hexagon] Fixing broken tests. llvm-svn: 223823	2014-12-09 20:36:53 +00:00
Rafael Espindola	5dec7eaae2	Rename createIRObjectFile to just create. It is a static method of IRObjectFile, so having to use IRObjectFile::createIRObjectFile was redundant. llvm-svn: 223822	2014-12-09 20:36:13 +00:00
Colin LeMahieu	4af437fee5	[Hexagon] Updating rr/ri 32/64 transfer encodings and adding tests. llvm-svn: 223821	2014-12-09 20:23:30 +00:00
Juergen Ributzka	c6f314b8ed	[FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'. The load/store value type is currently not available when lowering the memcpy intrinsic. Add the missing nullptr check to support this in 'computeAddress'. Fixes rdar://problem/19178947. llvm-svn: 223818	2014-12-09 19:44:38 +00:00
Colin LeMahieu	b580d7d8c8	[Hexagon] Adding word combine dot-new form and replacing old combine opcode. llvm-svn: 223815	2014-12-09 19:23:45 +00:00
Chandler Carruth	a7f247ea56	Revert r223764 which taught instcombine about integer-based element extraction patterns. This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely also on ARM and PPC. I really don't know how, but reverting now that I've confirmed this is actually the culprit. I have a reproduction as well and so should be able to restore this shortly. This reverts commit r223764. Original commit log follows: Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. llvm-svn: 223813	2014-12-09 19:21:16 +00:00
David Majnemer	2defbada38	AsmParser: Don't crash on short hex constants for fp128 types If we see 0xL01, treat it like 0xL00000000000000000000000000000001 instead of crashing. llvm-svn: 223811	2014-12-09 19:10:03 +00:00
Frederic Riss	35f0a9aeba	Remove unneeded curly braces. llvm-svn: 223809	2014-12-09 18:57:39 +00:00
Frederic Riss	ff58fd207e	Reorder the code to avoid inserting at the beginning of a vector. As per dblaikie's suggestion, thanks! llvm-svn: 223808	2014-12-09 18:57:34 +00:00
Duncan P. N. Exon Smith	562283189d	Fix a GCC build failure from r223802 llvm-svn: 223806	2014-12-09 18:52:38 +00:00
Robert Khasanov	8e8c39963d	[AVX512] Added lowering for VBROADCASTSS/SD instructions. Lowering patterns were written through avx512_broadcast_pat multiclass as pattern generates VBROADCAST and COPY_TO_REGCLASS nodes. Added lowering tests. llvm-svn: 223804	2014-12-09 18:45:30 +00:00
Duncan P. N. Exon Smith	5bf8fef580	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa<ConstantInt>(N->getOperand(0))); baz(cast<ConstantInt>(N->getOperand(1))); bak(cast_or_null<ConstantInt>(N->getOperand(2))); bat(dyn_cast<ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa<ConstantInt>(N->getOperand(0))); baz(mdconst::extract<ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null<ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract<ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa<MDInt>(N->getOperand(0))); baz(cast<MDInt>(N->getOperand(1))); bak(cast_or_null<MDInt>(N->getOperand(2))); bat(dyn_cast<MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
David Majnemer	b39e22bdc5	AsmParser: Don't crash on malformed attribute groups This fixes PR21785. llvm-svn: 223801	2014-12-09 18:33:57 +00:00
Colin LeMahieu	30dcb232b0	[Hexagon] Updating predicate register transfers and adding tstbit to allow select selection. Updating ll tests with predicate transfers that previously had nop encodings. llvm-svn: 223800	2014-12-09 18:16:49 +00:00
Frederic Riss	7c78db5065	Correctly handle complex location expressions in replaceDbgDeclareForAlloca() replaceDbgDeclareForAlloca() replaces an alloca by a value storing the address of what was the alloca. If there is a dbg.declare corresponding to that alloca, we need to lower it to a dbg.value describing the additional dereference operation to be performed to get to the underlying variable. This is done by adding a DW_OP_deref to the complex location part of the location description. This deref was added to the end of the operation list, which is wrong. The expression applies to what is described by the dbg.{declare,value}, and as we are changing this, we need to apply the DW_OP_deref as the first operation in the list. Part of the fix for rdar://19162268. llvm-svn: 223799	2014-12-09 17:55:48 +00:00
Juergen Ributzka	8bda738221	[CGP] Rewrite pattern match for splitBranchCondition to work with Values instead. Rewrite the pattern match code to also work with Values instead of with Instructions only. Also remove the no longer needed matcher (m_Instruction). llvm-svn: 223797	2014-12-09 17:50:10 +00:00
Juergen Ributzka	194350a936	Revert "Move function to obtain branch weights into the BranchInst class. NFC." This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare. llvm-svn: 223795	2014-12-09 17:32:12 +00:00
Bill Schmidt	efe9ce216e	[PowerPC 4/4] Enable little-endian support for VSX. With the foregoing three patches, VSX instructions can be used for little endian. This patch removes the restriction that prevented this, and re-enables the test cases from the first three patches. llvm-svn: 223792	2014-12-09 16:59:57 +00:00
Bill Schmidt	3014435ca9	[PowerPC 3/4] Little-endian adjustments for VSX vector shuffle When performing instruction selection for ISD::VECTOR_SHUFFLE, there is special code for handling v2f64 and v2i64 using VSX instructions. This code must be adjusted for little-endian. Because the two inputs are treated as a double-wide register, we must swap their order for little endian. To get the appropriate mask elements to use with the big-endian biased XXPERMDI instruction, we must reverse their order and invert the bits. A new test is added to test the 16 possible values of the shuffle mask. It is initially disabled for reasons specified in the test. It is re-enabled by patch 4/4. llvm-svn: 223791	2014-12-09 16:52:29 +00:00
Bill Schmidt	10f6eb91a0	[PowerPC 2/4] Little-endian adjustments for VSX insert/extract operations For little endian, we need to make some straightforward adjustments in the code expansions for scalar_to_vector and vector_extract of v2f64. First, scalar_to_vector must place the scalar into vector element zero. However, our implementation of SUBREG_TO_REG will place it into big-endian vector element zero (high-order bits), and for little endian we need it in the low-order bits. The LE implementation splats the high-order doubleword into the low-order doubleword. Second, the meaning of (vector_extract x, 0) and (vector_extract x, 1) must be reversed for similar reasons. A new test is added that tests code generation for insertelement and extractelement for both element 0 and element 1. It is disabled in this patch but enabled in patch 4/4, for reasons stated in the test. llvm-svn: 223788	2014-12-09 16:43:32 +00:00
Robert Khasanov	cbc5703aeb	[AVX512] Added VPBROADCAST{BWDQ} (Load with Broadcast Integer Data from General Purpose Register) encodings for AVX512-BW/VL subsets Added encoding tests. llvm-svn: 223787	2014-12-09 16:38:41 +00:00
Juergen Ributzka	c1bbcbbd32	[CodeGenPrepare] Split branch conditions into multiple conditional branches. This optimization transforms code like: bb1: %0 = icmp ne i32 %a, 0 %1 = icmp ne i32 %b, 0 %or.cond = or i1 %0, %1 br i1 %or.cond, label %TrueBB, label %FalseBB into multiple branch instructions like: bb1: %0 = icmp ne i32 %a, 0 br i1 %0, label %TrueBB, label %bb2 bb2: %1 = icmp ne i32 %b, 0 br i1 %1, label %TrueBB, label %FalseBB This optimization is already performed by SelectionDAG, but not by FastISel. FastISel cannot perform this optimization, because it cannot generate new MachineBasicBlocks. Performing this optimization at CodeGenPrepare time makes it available to both - SelectionDAG and FastISel - and the implementation in SelectionDAG could be removed. There are currently a few differences in codegen for X86 and PPC, so this commit only enables it for FastISel. Reviewed by Jim Grosbach This fixes rdar://problem/19034919. llvm-svn: 223786	2014-12-09 16:36:13 +00:00
Juergen Ributzka	e2aa3aa38a	Move function to obtain branch weights into the BranchInst class. NFC. Make this function available to other parts of LLVM. llvm-svn: 223784	2014-12-09 16:36:06 +00:00
Bill Schmidt	fae5d71584	[PowerPC 1/4] Little-endian adjustments for VSX loads/stores This patch addresses the inherent big-endian bias in the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. These instructions load vector elements into registers left-to-right (with the first element loaded into the high-order bits of the register), regardless of the endian setting of the processor. However, these are the only vector memory instructions that permit unaligned storage accesses, so we want to use them for little-endian. To make this work, an lxvd2x or lxvw4x is replaced with an lxvd2x followed by an xxswapd, which swaps the doublewords. This works for lxvw4x as well as lxvd2x, because for lxvw4x on an LE system the vector elements are in LE order (right-to-left) within each doubleword. (Thus after lxvw4x of a <4 x float> the elements will appear as 1, 0, 3, 2. Following the swap, they will appear as 3, 2, 1, 0, as desired.) For stores, an stxvd2x or stxvw4x is replaced with an stxvd2x preceded by an xxswapd. Introduction of extra swap instructions provides correctness, but obviously is not ideal from a performance perspective. Future patches will address this with optimizations to remove most of the introduced swaps, which have proven effective in other implementations. The introduction of the swaps is performed during lowering of LOAD, STORE, INTRINSIC_W_CHAIN, and INTRINSIC_VOID operations. The latter are used to translate intrinsics that specify the VSX loads and stores directly into equivalent sequences for little endian. Thus code that uses vec_vsx_ld and vec_vsx_st does not have to be modified to be ported from BE to LE. We introduce new PPCISD opcodes for LXVD2X, STXVD2X, and XXSWAPD for use during this lowering step. In PPCInstrVSX.td, we add new SDType and SDNode definitions for these (PPClxvd2x, PPCstxvd2x, PPCxxswapd). These are recognized during instruction selection and mapped to the correct instructions. Several tests that were written to use -mcpu=pwr7 or pwr8 are modified to disable VSX on LE variants because code generation changes with this and subsequent patches in this set. I chose to include all of these in the first patch rather than try to rigorously sort out which tests were broken by one or another of the patches. Sorry about that. The new test vsx-ldst-builtin-le.ll, and the changes to vsx-ldst.ll, are disabled until LE support is enabled because of breakages that occur as noted in those tests. They are re-enabled in patch 4/4. llvm-svn: 223783	2014-12-09 16:35:51 +00:00
Rafael Espindola	25a7e0a89f	Move method out of line to make buildbot happy. llvm-svn: 223781	2014-12-09 16:18:11 +00:00
Rafael Espindola	527e846ef7	Don't look up an object symbol name in the module. Instead, walk the obj symbol list in parallel to find the GV. This shouldn't change anything on ELF where global symbols are not mangled, but it is a step toward supporting other object formats. Gold itself is ELF only, but bfd ld supports COFF and the logic in the gold plugin could be reused on lld. llvm-svn: 223780	2014-12-09 16:13:59 +00:00
Chandler Carruth	f57ac3bd22	[x86] Fix the test to actually test things for the CPU names, add the missing barcelona CPU which that test uncovered, and remove the 32-bit x86 CPUs which I really wasn't prepared to audit and test thoroughly. If anyone wants to clean up the 32-bit only x86 CPUs, go for it. Also, if anyone else wants to try to de-duplicate the AMD CPUs, that'd be cool, but from the looks of it wouldn't save as much as it did for the Intel CPUs. llvm-svn: 223774	2014-12-09 14:25:55 +00:00
Aaron Ballman	f588251b99	Removing an unused variable to silence a -Wunused-but-set-variable warning. NFC. llvm-svn: 223773	2014-12-09 13:20:11 +00:00
Asiri Rathnayake	7835e9b232	Fix modified immediate bug reported by MC Hammer. Instructions of the form [ADD Rd, pc, #imm] are manually aliased in processInstruction() to use ADR. To accommodate this, mod_imm handling had to be tweaked a bit. Turns out it was the manual aliasing that must be tweaked to accommodate mod_imms instead. More information about the parsed instruction is available at the point where processInstruction() is invoked, which makes it easier to detect a mod_imm at that point rather than trying to detect a potential alias when a mod_imm is being prepped. Added a test case and fixed some white spaces as well. llvm-svn: 223772	2014-12-09 13:14:58 +00:00
Chandler Carruth	af892403c2	[x86] Bring some sanity to the x86 CPU processor definitions. Notably, this adds simple micro-architecture names for the Intel CPU variants, and defines the old 'core'-based names as aliases. GCC has started to simplify their documented interface to use these names as well, so it seems like we can start to converge on a consistent pattern. I'd appreciate Intel double checking the entries that aren't yet documented widely, especially Atom (Bonnell and Silvermont), Knights Landing, and Skylake. But this change shouldn't break any existing users. Also, ran clang-format to re-format this code and it actually worked (modulo a tiny bug) so hopefully we can start to stop thinking about formatting this stuff. llvm-svn: 223769	2014-12-09 10:58:36 +00:00
Chandler Carruth	7415205113	Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. 
- We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. Differential Revision: http://reviews.llvm.org/D6548 llvm-svn: 223764	2014-12-09 08:55:32 +00:00
Michael Ilseman	2770c2d6d4	Skip declarations in the case of functions. This is a revert of r223521 in spirit, if not in content. I am not sure why declarations ended up in LazilyLinkGlobalValues in the first place; that will take some more investigation. llvm-svn: 223763	2014-12-09 08:20:06 +00:00
Elena Demikhovsky	fa4a6c18f7	AVX-512: Added some comments to ERI scalar intrinsics. No functional change. llvm-svn: 223761	2014-12-09 07:06:32 +00:00
Owen Anderson	558012a3fc	Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64. llvm-svn: 223760	2014-12-09 06:50:39 +00:00
Mohit K. Bhakkad	e38c32ffec	test commit (spelling correction) llvm-svn: 223758	2014-12-09 06:31:07 +00:00
Michael Kuperstein	c69bb43f35	[X86] Convert esp-relative movs of function arguments into pushes, step 1 This handles the simplest case for mov -> push conversion: 1. x86-32 calling convention, everything is passed through the stack. 2. There is no reserved call frame. 3. Only registers or immediates are pushed, no attempt to combine a mem-reg-mem sequence into a single PUSHmm. Differential Revision: http://reviews.llvm.org/D6503 llvm-svn: 223757	2014-12-09 06:10:44 +00:00
David Majnemer	598bd05bd7	Reland r223754 The commit is identical except a reference to `GV' should have been to `GVal'. llvm-svn: 223756	2014-12-09 05:56:09 +00:00
David Majnemer	8d3e580cc7	Revert "AsmParser: Reject invalid mismatch between forward ref and def" This reverts commit r223754. I've upset the buildbots. llvm-svn: 223755	2014-12-09 05:50:11 +00:00
David Majnemer	e9efecaa52	AsmParser: Reject invalid mismatch between forward ref and def Don't assume that the forward referenced entity was of the same global-kind as the new entity. This fixes PR21779. llvm-svn: 223754	2014-12-09 05:43:56 +00:00
Bill Schmidt	0913500021	Restore r223709 as it was meant to be, and enable FeatureP8Vector for P8 llvm-svn: 223751	2014-12-09 03:02:48 +00:00
NAKAMURA Takumi	cc4487eb8b	Revert r223709, "[PowerPC]Activate FeatureVSX for the Power target", to unbreak bots. CodeGen/PowerPC/vsx-p8.ll was failing. '+power8-vector' is not a recognized feature for this target (ignoring feature) llvm/test/CodeGen/PowerPC/vsx-p8.ll:33:14: error: expected string not found in input ; CHECK-REG: lxvw4x 34, 0, 3 ^ <stdin>:50:2: note: scanning from here .align 3 ^ <stdin>:61:2: note: possible intended match here lvx 3, 0, 3 ^ llvm-svn: 223729	2014-12-09 01:03:27 +00:00
Hal Finkel	c8cf2b88bc	Handle early-clobber registers in the aggressive anti-dep breaker The aggressive anti-dep breaker, used by the PowerPC backend during post-RA scheduling (but is available to all targets), did not handle early-clobber MI operands (at all). When constructing the list of available registers for the replacement of some def operand, check the using instructions, and remove registers assigned to early-clobbered defs from the set. Fixes PR21452. llvm-svn: 223727	2014-12-09 01:00:59 +00:00
Tom Stellard	3e41dc419c	R600/SI: Set MayStore = 0 on MUBUF loads llvm-svn: 223722	2014-12-09 00:03:54 +00:00
Tom Stellard	3260ec41cf	R600/SI: Move setting of the lds bit to the base MUBUF class llvm-svn: 223721	2014-12-09 00:03:51 +00:00
Colin LeMahieu	5cf5632696	[Hexagon] Removing old def versions and replacing usages with versions that have encodings. llvm-svn: 223720	2014-12-08 23:55:43 +00:00
Tom Stellard	3e01d47d98	MISched: Fix moving stores across barriers This fixes an issue with ScheduleDAGInstrs::buildSchedGraph where stores without an underlying object would not be added as a predecessor to the current BarrierChain. llvm-svn: 223717	2014-12-08 23:36:48 +00:00
Colin LeMahieu	f5b4d655d2	[Hexagon] Adding any8, all8, and/or/xor/andn/orn/not predicate register forms, mask, and vitpack instructions and patterns. llvm-svn: 223710	2014-12-08 23:07:59 +00:00
Bill Seurer	05663d8589	[PowerPC]Activate FeatureVSX for the Power target This change activates FeatureVSX for Power 7 and Power 8 in PPC.td. http://reviews.llvm.org/D6570 llvm-svn: 223709	2014-12-08 23:07:12 +00:00
Hal Finkel	aa10b3caaf	[PowerPC] Don't use a non-allocatable register to implement the 'cc' alias GCC accepts 'cc' as an alias for 'cr0', and we need to do the same when processing inline asm constraints. This had previously been implemented using a non-allocatable register, named 'cc', that was listed as an alias of 'cr0', but the infrastructure does not seem to support this properly (neither the register allocator nor the scheduler properly accounts for the alias). Instead, we can just process this as a naming alias inside of the inline asm constraint-processing code, so we'll do that instead. There are two regression tests, one where the post-RA scheduler did the wrong thing with the non-allocatable alias, and one where the register allocator did the wrong thing. Fixes PR21742. llvm-svn: 223708	2014-12-08 22:54:22 +00:00
Colin LeMahieu	b6c4dd96f9	[Hexagon] Adding xtype doubleword add, sub, and, or, xor and patterns. llvm-svn: 223702	2014-12-08 22:19:14 +00:00
Colin LeMahieu	9bfe5473da	[Hexagon] Adding xtype doubleword comparisons. Removing unused multiclass. llvm-svn: 223701	2014-12-08 21:56:47 +00:00
Colin LeMahieu	025f860638	[Hexagon] Adding xtype parity, min, minu, max, maxu instructions. llvm-svn: 223693	2014-12-08 21:19:18 +00:00
Colin LeMahieu	8d1376c60e	[Hexagon] Adding xtype halfword add/sub ll/hl/lh/hh/sat/<<16 instructions. llvm-svn: 223692	2014-12-08 20:33:01 +00:00
Matt Arsenault	13bd95bbc7	R600/SI: Move continue after checking s_mov_b32. There's nothing else to bother trying to shrink these. llvm-svn: 223686	2014-12-08 19:55:43 +00:00
David Majnemer	770fd82f39	ConstantFold: Zero-sized globals might land on top of another global A zero sized array is zero sized and might share its address with another global. llvm-svn: 223684	2014-12-08 19:35:31 +00:00
Rafael Espindola	ef23711eee	Lazily link GlobalVariables and GlobalAliases. We were already lazily linking functions, but all GlobalValues can be treated uniformly for this. The test updates are to ensure that a given GlobalValue is still linked in. This fixes pr21494. llvm-svn: 223681	2014-12-08 18:45:16 +00:00
Colin LeMahieu	cc46cd8eec	[Hexagon] Adding add/sub with saturation. Removing unused def. Cleaning up shift patterns. llvm-svn: 223680	2014-12-08 18:33:49 +00:00
David Majnemer	d5b3aa49ac	InstSimplify: Try to bring back the rest of r223583 This reverts r223624 with a small tweak, hopefully this will make stage3 equivalent. llvm-svn: 223679	2014-12-08 18:30:43 +00:00
Bruno Cardoso Lopes	27de9b0f70	[CompactUnwind] Fix register encoding logic Fix a compact unwind encoding logic bug which would try to encode more callee saved registers than it should, leading to early bail out in the encoding logic and abusive use of DWARF frame mode unnecessarily. Also remove no-compact-unwind.ll which was testing the wrong thing based on this bug and move it to valid 'compact unwind' tests. Added a few more tests too. llvm-svn: 223676	2014-12-08 18:18:32 +00:00
Rafael Espindola	beadd56a7d	Don't crash when the key of a comdat is lazily linked. llvm-svn: 223673	2014-12-08 18:05:48 +00:00
Justin Bogner	61ba2e3996	InstrProf: An intrinsic and lowering for instrumentation based profiling Introduce the ``llvm.instrprof_increment`` intrinsic and the ``-instrprof`` pass. These provide the infrastructure for writing counters for profiling, as in clang's ``-fprofile-instr-generate``. The implementation of the instrprof pass is ported directly out of the CodeGenPGO classes in clang, and with the followup in clang that rips that code out to use these new intrinsics this ends up being NFC. Doing the instrumentation this way opens some doors in terms of improving the counter performance. For example, this will make it simple to experiment with alternate lowering strategies, and allows us to try handling profiling specially in some optimizations if we want to. Finally, this drastically simplifies the frontend and puts all of the lowering logic in one place. llvm-svn: 223672	2014-12-08 18:02:35 +00:00
Tim Northover	67be569a31	AArch64: treat HFAs containing "half" types as blocks too. llvm-svn: 223669	2014-12-08 17:54:58 +00:00
Andrea Di Biagio	d80836ed09	[X86] Improved tablegen patterns for matching TZCNT/LZCNT. Teach ISel how to match a TZCNT/LZCNT from a conditional move if the condition code is X86_COND_NE. Existing tablegen patterns only allowed to match TZCNT/LZCNT from a X86cond with condition code equal to X86_COND_E. To avoid introducing extra rules, I added an 'ImmLeaf' definition that checks if the condition code is COND_E or COND_NE. llvm-svn: 223668	2014-12-08 17:47:18 +00:00
Colin LeMahieu	b56e6cd9b9	[Hexagon] Adding combine reg, reg with predicated forms. llvm-svn: 223667	2014-12-08 17:33:06 +00:00
Colin LeMahieu	a55070dbdd	[Hexagon] Adding packhl instruction. llvm-svn: 223664	2014-12-08 17:01:18 +00:00
Daniel Sanders	c8a040c390	[mips] Add Mips-specific CCIf's for accessing the MipsCCState. NFC. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6213 llvm-svn: 223662	2014-12-08 15:40:09 +00:00
Andrea Di Biagio	64bc246f3f	[X86] Improved lowering of packed v8i16 vector shifts by non-constant count. Before this patch, the backend sub-optimally expanded the non-constant shift count of a v8i16 shift into a sequence of two 'movd' plus 'movzwl'. With this patch the backend checks if the target features sse4.1. If so, then it lets the shuffle legalizer deal with the expansion of the shift amount. Example: ;; define <8 x i16> @test(<8 x i16> %A, <8 x i16> %B) { %shamt = shufflevector <8 x i16> %B, <8 x i16> undef, <8 x i32> zeroinitializer %shl = shl <8 x i16> %A, %shamt ret <8 x i16> %shl } ;; Before (with -mattr=+avx): vmovd %xmm1, %eax movzwl %ax, %eax vmovd %eax, %xmm1 vpsllw %xmm1, %xmm0, %xmm0 retq Now: vpxor %xmm2, %xmm2, %xmm2 vpblendw $1, %xmm1, %xmm2, %xmm1 vpsllw %xmm1, %xmm0, %xmm0 retq llvm-svn: 223660	2014-12-08 14:36:51 +00:00
Rafael Espindola	3519da82b8	Move the ValueMap lookup inside linkFunctionBody. NFC. llvm-svn: 223659	2014-12-08 14:25:26 +00:00
Rafael Espindola	a314d1aca4	Use range loops. NFC. llvm-svn: 223658	2014-12-08 14:20:10 +00:00
Rafael Espindola	21ec84eb81	Use range loops. NFC. llvm-svn: 223657	2014-12-08 14:05:33 +00:00
Rafael Espindola	869d1ce811	Fix linking of prologue data. It would crash when the function was lazy linked. llvm-svn: 223656	2014-12-08 13:44:38 +00:00
Rafael Espindola	f97d0cbe58	Simple style fixes. * Use a range loop. * Move simple continue checks earlier. * clang-format. llvm-svn: 223654	2014-12-08 13:35:09 +00:00
Rafael Espindola	40d7ebed8a	Move materialize/Dematerialize calls to linkFunctionBody. NFC. Just less code duplication. llvm-svn: 223653	2014-12-08 13:29:33 +00:00
Elena Demikhovsky	68e04b8613	X86 intrinsics moved from X86ISelLowering.cpp to X86IntrinsicsInfo.h X86ISelLowering.cpp has a long switch for intrinsics. I moved a part of this long switch to the new intrinsics table in X86IntrinsicsInfo.h. No functional changes, just code and compile time optimization. llvm-svn: 223641	2014-12-08 09:03:08 +00:00
NAKAMURA Takumi	2b6e662672	Revert a part of r223583, for now. It seems to cause different emission between stage2(gcc-clang) and stage3 clang. Investigating. llvm-svn: 223624	2014-12-08 02:07:22 +00:00
Duncan P. N. Exon Smith	9c51b50a71	IR: Revert r223618 behaviour of MDNode::concatenate() r223618 included special handling of `MDNode::intersect()`: if the first operand is a self-reference with the same operands you're trying to return, return it instead. Reuse that handling in `MDNode::concatenate()` in the hopes that it fixes a polly test that seems to rely on the old behaviour [1]. [1]: http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/25167 llvm-svn: 223619	2014-12-07 20:32:11 +00:00
Duncan P. N. Exon Smith	ac8ee289eb	IR: Drop uniquing for self-referencing MDNodes It doesn't make sense to unique self-referencing nodes. Drop uniquing for them. Note that `MDNode::intersect()` occasionally returns self-referencing nodes. Previously these would be returned by `MDNode::get()`. I'm not convinced this was intended behaviour -- to me it seems it should return a node whose only operand is the self-reference -- but I don't know much about alias scopes so I'm preserving it for now. This is part of PR21532. llvm-svn: 223618	2014-12-07 19:52:06 +00:00
Duncan P. N. Exon Smith	545a9b0f51	IR: Add missing tests for function-local metadata Add assembly and bitcode tests that I neglected to add in r223564 (IR: Disallow complicated function-local metadata) and r223574 (IR: Disallow function-local metadata attachments). Found a couple of bugs: - The error message for function-local attachments gave the wrong line number -- it indicated the next token (typically on the next line) instead of the token that started the attachment. Fixed. - Metadata arguments of the form `!{i32 0, i32 %v}` (or with the arguments reversed) fired an assertion in `ValueEnumerator` in LLVM v3.5, so I suppose this never really worked. I suppose this was "fixed" by r223564. (Thanks to dblaikie for pointing out my omission.) Part of PR21532. llvm-svn: 223616	2014-12-07 17:56:16 +00:00
Marek Olsak	fa58e5e111	R600/SI: Disable VMEM and SMEM clauses by breaking them with S_NOP This is only a workaround. llvm-svn: 223615	2014-12-07 17:17:43 +00:00
Marek Olsak	58f61a84e7	R600/SI: Set 20-bit immediate byte offset for SMRD on VI llvm-svn: 223614	2014-12-07 17:17:38 +00:00
Marek Olsak	be047806d1	R600/SI: Update instruction conversions for VI There are 3 changes: - Convert 32-bit S_LSHL/LSHR/ASHR to their V_*REV variants for VI - Lower RSQ_CLAMP for VI - Don't generate MIN/MAX_LEGACY on VI llvm-svn: 223604	2014-12-07 12:19:03 +00:00
Marek Olsak	5df00d63e2	R600/SI: Add VI instructions llvm-svn: 223603	2014-12-07 12:18:57 +00:00
Marek Olsak	b08604c4cd	R600/SI: Add SCC Defs/Uses to SOP1 and SOP2 opcodes llvm-svn: 223602	2014-12-07 12:18:45 +00:00
Benjamin Kramer	3280a5d9f5	Turn some DenseMaps that are only used for set operations into DenseSets. DenseSet has better memory efficiency now. llvm-svn: 223589	2014-12-06 19:22:54 +00:00
Benjamin Kramer	89e5306f43	Make the DenseMap bucket type configurable and use a smaller bucket for DenseSet. DenseSet used to be implemented as DenseMap<Key, char>, which usually doubled the memory footprint of the map. Now we use a compressed set so the second element uses no memory at all. This required some surgery on DenseMap as all accesses to the bucket now have to go through methods; this should have no impact on the behavior of DenseMap though. The new default bucket type for DenseMap is a slightly extended std::pair as we expose it through DenseMap's iterator and don't want to break any existing users. llvm-svn: 223588	2014-12-06 19:22:44 +00:00
Benjamin Kramer	8e5dc53784	Reapply "LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps." This reapplies r223478 with a fix for 32 bit targets. llvm-svn: 223586	2014-12-06 13:12:56 +00:00
David Majnemer	64ba326b1e	ConstantFold: Don't optimize comparisons with weak linkage objects Consider: void f() {} void __attribute__((weak)) g() {} bool b = &f != &g; It's possible for g to resolve to f if --defsym=g=f is passed on to the linker. llvm-svn: 223585	2014-12-06 11:58:33 +00:00
David Majnemer	ed00cd20ad	I didn't intend to commit this change. llvm-svn: 223584	2014-12-06 10:52:32 +00:00
David Majnemer	1af36e5baf	InstSimplify: Optimize away useless unsigned comparisons Code like X < Y && Y == 0 should always be folded away to false. llvm-svn: 223583	2014-12-06 10:51:40 +00:00
NAKAMURA Takumi	fc3062f65a	Reformat. llvm-svn: 223580	2014-12-06 05:57:06 +00:00
Tom Stellard	8d5f5e4238	R600/SI: Restore PrivateGlobalPrefix to the default ELF value of ".L" This was changed in r223323. llvm-svn: 223579	2014-12-06 05:34:34 +00:00
Duncan P. N. Exon Smith	35303fd739	IR: Disallow function-local metadata attachments Metadata attachments to instructions cannot be function-local. This is part of PR21532. llvm-svn: 223574	2014-12-06 02:29:44 +00:00
NAKAMURA Takumi	6980404cfe	LLVMInstrumentation requires MC since r223532. llvm-svn: 223573	2014-12-06 02:22:11 +00:00
Ahmed Bougacha	8b54286d1c	[X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns. Most patterns will go away once the extload legalization changes land. Differential Revision: http://reviews.llvm.org/D6125 llvm-svn: 223567	2014-12-06 01:31:07 +00:00
Hans Wennborg	08de833c1c	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-negligible binary size increase. It might be worth looking into some more. SimplifyCFG currently does this transformation, but I'm working towards changing that so we can optimize harder based on unreachable defaults. Differential Revision: http://reviews.llvm.org/D6510 llvm-svn: 223566	2014-12-06 01:28:50 +00:00
Duncan P. N. Exon Smith	da41af9e94	IR: Disallow complicated function-local metadata Disallow complex types of function-local metadata. The only valid function-local metadata is an `MDNode` whose sole argument is a non-metadata function-local value. Part of PR21532. llvm-svn: 223564	2014-12-06 01:26:49 +00:00
Duncan P. N. Exon Smith	b236211c4c	Utils: Style cleanups, NFC llvm-svn: 223556	2014-12-06 00:48:17 +00:00
Duncan P. N. Exon Smith	b13f7d2e36	Utils: Avoid RAUW on metadata in CloneFunction() llvm-svn: 223555	2014-12-06 00:48:13 +00:00
Nick Lewycky	05044c248e	Canonicalize multiplies by looking at whether the operands have any constants themselves. Patch by Tim Murray! llvm-svn: 223554	2014-12-06 00:45:50 +00:00
Tim Northover	5e84fe3ed4	AArch64: use explicit MVT::i64 when creating EXTRACT_SUBVECTOR nodes. All our patterns use MVT::i64, but the ISelLowering nodes were inconsistent in their choice. No functional change. llvm-svn: 223551	2014-12-06 00:33:37 +00:00
Benjamin Kramer	0dc0e54272	Revert "LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps." Somehow made DenseMap probe on forever on 32 bit machines. This reverts commit r223478. llvm-svn: 223546	2014-12-06 00:02:31 +00:00
Ahmed Bougacha	89bc485085	[X86] Cleanup FCOPYSIGN lowering. NFC intended. llvm-svn: 223542	2014-12-05 23:11:36 +00:00
Kuba Brecka	1001bb533b	Recommit of r223513 and r223514. Reviewed at http://reviews.llvm.org/D6488 llvm-svn: 223532	2014-12-05 22:19:18 +00:00
Colin LeMahieu	d8b766072b	[Hexagon] Relocating logical instructions and templates later in the td file. llvm-svn: 223523	2014-12-05 21:51:12 +00:00
Colin LeMahieu	2c77a35e6e	[Hexagon] Adding sub/and/or reg, imm forms llvm-svn: 223522	2014-12-05 21:38:29 +00:00
Rafael Espindola	de567e022b	Remove dead code. We are only lazy about functions with bodies. llvm-svn: 223521	2014-12-05 21:36:06 +00:00
Kuba Brecka	086e34bef8	Reverting r223513 and r223514. llvm-svn: 223520	2014-12-05 21:32:46 +00:00
Sanjay Patel	4bf9b7685c	Optimize merging of scalar loads for 32-byte vectors [X86, AVX] Fix the poor codegen seen in PR21710 ( http://llvm.org/bugs/show_bug.cgi?id=21710 ). Before we crack 32-byte build vectors into smaller chunks (and then subsequently glue them back together), we should look for the easy case where we can just load all elements in a single op. An example of the codegen change is: From: vmovss 16(%rdi), %xmm1 vmovups (%rdi), %xmm0 vinsertps $16, 20(%rdi), %xmm1, %xmm1 vinsertps $32, 24(%rdi), %xmm1, %xmm1 vinsertps $48, 28(%rdi), %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 retq To: vmovups (%rdi), %ymm0 retq Differential Revision: http://reviews.llvm.org/D6536 llvm-svn: 223518	2014-12-05 21:28:14 +00:00
Peter Collingbourne	0826e60748	[DFSAN][MIPS][LLVM] Defining ShadowPtrMask variable for MIPS64 Patch by Kumar Sukhani! corresponding compiler-rt patch: http://reviews.llvm.org/D6437 clang patch: http://reviews.llvm.org/D6147 Differential Revision: http://reviews.llvm.org/D6459 llvm-svn: 223516	2014-12-05 21:22:32 +00:00
Colin LeMahieu	9665f98c10	[Hexagon] Updating mux_ir/ri/ii/rr with encoding bits llvm-svn: 223515	2014-12-05 21:09:27 +00:00
Kuba Brecka	1e21378a37	AddressSanitizer - Don't instrument globals from cstring_literals sections. (llvm part) Reviewed at http://reviews.llvm.org/D6488 llvm-svn: 223513	2014-12-05 21:04:43 +00:00
Rafael Espindola	28a2451b35	Simplify the loop linking function bodies. NFC. llvm-svn: 223512	2014-12-05 21:04:36 +00:00
Jan Wen Voung	f547861ba0	Use 32-bit ebp for NaCl64 in a limited case: llvm.frameaddress. Summary: Follow up to [x32] "Use ebp/esp as frame and stack pointer": http://reviews.llvm.org/D4617 In that earlier patch, NaCl64 was made to always use rbp. That's needed for most cases because rbp should hold a full 64-bit address within the NaCl sandbox so that load/stores off of rbp don't require sandbox adjustment (zeroing the top 32-bits, then filling those by adding r15). However, llvm.frameaddress returns a pointer and pointers are 32-bit for NaCl64. In this case, use ebp instead, which will make the register copy type check. A similar mechanism may be needed for llvm.eh.return, but is not added in this change. Test Plan: test/CodeGen/X86/frameaddr.ll Reviewers: dschuff, nadav Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D6514 llvm-svn: 223510	2014-12-05 20:55:53 +00:00
Bill Seurer	8c728ae9fb	[PowerPC]Add VSX loads/stores to fastisel for PPC target This patch adds VSX floating point loads and stores to fastisel. Along with the change to tablegen (D6220), VSX instructions are now fully supported in fastisel. http://reviews.llvm.org/D6274 llvm-svn: 223507	2014-12-05 20:15:56 +00:00
Colin LeMahieu	19985e9a8d	[Hexagon] Adding tfrih/l instructions. llvm-svn: 223506	2014-12-05 20:07:19 +00:00
Andrea Di Biagio	3e425c8d19	[X86] Improved lowering of packed vector shifts to vpsllq/vpsrlq. SSE2/AVX non-constant packed shift instructions only use the lower 64-bit of the shift count. This patch teaches function 'getTargetVShiftNode' how to deal with shifts where the shift count node is of type MVT::i64. Before this patch, function 'getTargetVShiftNode' only knew how to deal with shift count nodes of type MVT::i32. This forced the backend to wrongly truncate the shift count to MVT::i32, and then zero-extend it back to MVT::i64. llvm-svn: 223505	2014-12-05 20:02:22 +00:00
Colin LeMahieu	a4ab58101a	[Hexagon] Adding add reg, imm form with encoding bits and test. llvm-svn: 223504	2014-12-05 19:51:23 +00:00
Rafael Espindola	2bd5b9f558	Remove unused arguments. NFC. llvm-svn: 223503	2014-12-05 19:35:07 +00:00
Eric Christopher	d1fb7e4590	These two calls were grabbing the same register info. Unify them. llvm-svn: 223502	2014-12-05 19:23:55 +00:00
Duncan P. N. Exon Smith	57cbdfc99a	BFI: Saturate when combining edges to a successor When a loop gets bundled up, its outgoing edges are quite large, and can just barely overflow 64-bits. If one successor has multiple incoming edges -- and that successor is getting all the incoming mass -- combining just its edges can overflow. Handle that by saturating rather than asserting. This fixes PR21622. llvm-svn: 223500	2014-12-05 19:13:42 +00:00
Colin LeMahieu	383c36e3a8	[Hexagon] Adding DoubleRegs decoder. Moving C2_mux and A2_nop. Adding combine imm-imm form. llvm-svn: 223494	2014-12-05 18:24:06 +00:00
Adrian Prantl	b9a88e2942	Fix a bug when pretty-printing DW_OP_deref. llvm-svn: 223493	2014-12-05 18:19:38 +00:00
Ahmed Bougacha	55e3c2d9cf	[CodeGenPrepare] Use variables for reused values. NFC. llvm-svn: 223491	2014-12-05 18:04:40 +00:00
Colin LeMahieu	63035ebee1	[Hexagon] [NFC] Rearranging patterns and mux instruction. llvm-svn: 223488	2014-12-05 17:58:06 +00:00
Colin LeMahieu	7358593e34	[Hexagon] [NFC] Rearranging def order. llvm-svn: 223487	2014-12-05 17:55:51 +00:00
Rafael Espindola	26c2951117	Refactor duplicated code. NFC. llvm-svn: 223486	2014-12-05 17:53:15 +00:00
Colin LeMahieu	7f0a430c7d	[Hexagon] Adding combine reg-reg forms. llvm-svn: 223485	2014-12-05 17:38:36 +00:00
Colin LeMahieu	01785bb063	[Hexagon] Marking several instructions as isCodeGenOnly=0 and adding direct disassembly tests for many instructions. llvm-svn: 223482	2014-12-05 17:27:39 +00:00
Benjamin Kramer	f8caa28517	LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps. Required some APInt massaging to get proper empty/tombstone values. Apart from making the code a bit simpler this also reduces the bucket size of the ConstantInt map from 32 to 24 bytes. llvm-svn: 223478	2014-12-05 17:03:01 +00:00
Rafael Espindola	879aeb776c	Small cleanup on how we clear constant variables. NFC. llvm-svn: 223474	2014-12-05 16:05:19 +00:00
Rafael Espindola	ad9d0ca878	Use an early return. NFC. llvm-svn: 223470	2014-12-05 15:42:30 +00:00
Evgeniy Stepanov	d85ddee01d	[msan] Avoid extra origin address realignment. Do not realign origin address if the corresponding application address is at least 4-byte-aligned. Saves 2.5% code size in track-origins mode. llvm-svn: 223464	2014-12-05 14:34:03 +00:00
Andrea Di Biagio	2876a67312	[X86] Avoid introducing extra shuffles when lowering packed vector shifts. When lowering a vector shift node, the backend checks if the shift count is a shuffle with a splat mask. If so, then it introduces an extra dag node to extract the splat value from the shuffle. The splat value is then used to generate a shift count of a target specific shift. However, if we know that the shift count is a splat shuffle, we can use the splat index 'I' to extract the I-th element from the first shuffle operand. The advantage is that the splat shuffle may become dead since we no longer use it. Example: ;; define <4 x i32> @example(<4 x i32> %a, <4 x i32> %b) { %c = shufflevector <4 x i32> %b, <4 x i32> undef, <4 x i32> zeroinitializer %shl = shl <4 x i32> %a, %c ret <4 x i32> %shl } ;; Before this patch, llc generated the following code (-mattr=+avx): vpshufd $0, %xmm1, %xmm1 # xmm1 = xmm1[0,0,0,0] vpxor %xmm2, %xmm2 vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7] vpslld %xmm1, %xmm0, %xmm0 retq With this patch, the redundant splat operation is removed from the code. vpxor %xmm2, %xmm2 vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7] vpslld %xmm1, %xmm0, %xmm0 retq llvm-svn: 223461	2014-12-05 12:13:30 +00:00
Charlie Turner	c96e95c157	Add missing FP build attribute tests. The test file test/CodeGen/ARM/build-attributes.ll was missing several floating-point build attribute tests. The intention of this commit is that for each CPU / architecture currently tested, there are now tests that make sure the following attributes are sufficiently checked, * Tag_ABI_FP_rounding * Tag_ABI_FP_denormal * Tag_ABI_FP_exceptions * Tag_ABI_FP_user_exceptions * Tag_ABI_FP_number_model Also in this commit, the -unsafe-fp-math flag has been augmented with the full suite of flags Clang sends to LLVM when you pass -ffast-math to Clang. That is, `-unsafe-fp-math' has been changed to `-enable-unsafe-fp-math -disable-fp-elim -enable-no-infs-fp-math -enable-no-nans-fp-math -fp-contract=fast' Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc llvm-svn: 223454	2014-12-05 08:22:47 +00:00
Hal Finkel	66d7791176	Revert "r223440 - Consider subregs when calling MI::registerDefIsDead for phys deps" Reverting this because, while it fixes the problem in the reduced test case, it does not fix the problem in the full test case from the bug report. llvm-svn: 223442	2014-12-05 02:07:35 +00:00
Hal Finkel	d013d99fe0	Consider subregs when calling MI::registerDefIsDead for phys deps The scheduling dependency graph is built bottom-up within each scheduling region, and ScheduleDAGInstrs::addPhysRegDeps is called to add output/anti dependencies, based on physical registers, to the SUs for instructions based on those that come before them. In the test case, we start before post-RA scheduling with a block that looks like this: ... INLINEASM <... andc $0,$0,$2 stdcx. $0,0,$3 bne- 1b > [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef-ec:G8RC], %X6<earlyclobber,def,dead>, $1:[mem], %X3<kill>, $2:[reguse:G8RC], %X5<kill>, $3:[reguse:G8RC], %X3, $4:[mem], %X3, $5:[clobber], %CC<earlyclobber,imp-def,dead>, <<badref>> ... %X4<def,dead> = ANDIo8 %X4<kill>, 1, %CR0<imp-def,dead>, %CR0GT<imp-def> ... %R29<def> = ISEL %R3<undef>, %R4<kill>, %CR0GT<kill> where it is relevant that %CC is an alias to %CR0, and that %CR0GT is a subregister of %CR0. However, for post-RA scheduling, no dependency was added to prevent the INLINEASM from being scheduled in between the ANDIo8 and the ISEL (which communicate via the %CR0GT register). In ScheduleDAGInstrs::addPhysRegDeps, when called for the %CC operand, we'd iterate over all of its aliases (which include %CC itself and also %CR0), and look for previously-encountered defs of those registers. We'd find the ANDIo8, but decide not to add a dependency between the INLINEASM and the ANDIo8 because both the INLINEASM's def of %CC is dead, and also the ANDIo8 def of %CR0 is dead. This ignores, however, that ANDIo8 has a non-dead def of %CR0GT, a subregister of %CR0, and thus a dependency still must exist. To fix this problem, when calling registerDefIsDead on the SU with the def, we also check all subregisters for possible non-dead defs, and add the dependency if any are found. Fixes PR21742. llvm-svn: 223440	2014-12-05 01:57:22 +00:00
Duncan P. N. Exon Smith	c1a664fea2	IR: Stop relying on GetStringMapEntryFromValue() It relies on undefined behaviour. llvm-svn: 223438	2014-12-05 01:41:34 +00:00
Adrian Prantl	ab255fcd09	Cleanup: Calls to getDwarfRegNum() may actually fail, if there is no DWARF register number mapping, or if the register was a virtual register that was never materialized. Previously, we would just emit a bogus location, after this patch we don't emit a location at all by doing an early exit. After my bugfix in r223401 today, this doesn't actually happen on any target that I tested this with, but it's still preferable to make the possibility of a failure explicit. llvm-svn: 223428	2014-12-05 01:02:46 +00:00
Rafael Espindola	3124ed4b23	linkGlobalVariableProto never returns null. Simplify the caller. NFC. llvm-svn: 223424	2014-12-05 00:30:47 +00:00
Eric Christopher	2189515132	Rename the x86 isTargetMacho to isTargetMachO for uniformity. llvm-svn: 223421	2014-12-05 00:22:38 +00:00
Eric Christopher	66322e822c	Both of these subtargets have functions that check whether or not the target is mach-o. Use them. llvm-svn: 223420	2014-12-05 00:22:35 +00:00
Rafael Espindola	439835a6fe	Move merging of alignment to a central location. NFC. llvm-svn: 223418	2014-12-05 00:09:02 +00:00
Ahmed Bougacha	24ebb93da1	[X86] Delete dead code in fcopysign lowering. NFC. r32900 introduced custom lowering for fcopysign, with two checks to change the magnitude value's type if it's larger/smaller than the sign value's type. r32932 replaced that code for the smaller case. r43205 did the same for the larger case, but left the old code, now dead. llvm-svn: 223415	2014-12-04 23:52:15 +00:00
Adrian Prantl	da7e03f1bf	Simplify implementation and testcase of r223401 based on feedback from dblaikie. llvm-svn: 223405	2014-12-04 22:58:41 +00:00
Adrian Prantl	a3ae0b3b5b	Debug info: If the RegisterCoalescer::reMaterializeTrivialDef() is eliminating all uses of a vreg, update any DBG_VALUE describing that vreg to point to the rematerialized register instead. llvm-svn: 223401	2014-12-04 22:29:04 +00:00
Roman Divacky	6fd64ff577	Add a FIXME as requested by Renato Golin. llvm-svn: 223390	2014-12-04 21:39:24 +00:00
Yaron Keren	d908941236	Silence warning: variable 'buffer' set but not used. llvm-svn: 223389	2014-12-04 21:36:38 +00:00
Bruno Cardoso Lopes	fd52b95530	[x86] Fix isOffsetSuitableForCodeModel kernel code model offset Offset == 0 is a valid offset for kernel code model according to the x86_64 System V ABI. Found by inspection, no testcase. llvm-svn: 223383	2014-12-04 20:36:06 +00:00
Weiming Zhao	cc4bf3ff3d	[AArch64] Combining Load and IntToFp should check for neon availability llvm-svn: 223382	2014-12-04 20:25:50 +00:00
Asiri Rathnayake	13cef35cba	Fix yet another unseen regression caused by r223113 r223113 added support for ARM modified immediate assembly syntax, which assumes all immediate operands are prefixed with a '#'. This assumption is wrong as per the ARMARM, which recommends that all '#' characters be treated as optional. The current patch fixes this regression and adds a test case. A follow-up patch will expand the test coverage to other instructions. llvm-svn: 223381	2014-12-04 19:34:59 +00:00
Jonathan Roelofs	300d8ffdf2	Fix thumbv4t indirect calls So there are a couple of issues with indirect calls on thumbv4t. First, the most 'obvious' instruction, 'blx' isn't available until v5t. And secondly, the next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code because the saved off pc has its thumb bit cleared, so when the callee returns we end up in ARM mode.... yuck. The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it. We could cut down on code size by sharing the landing pads between call sites that are close enough, but for the moment let's do correctness first and look at performance later. Patch by: Iain Sandoe http://reviews.llvm.org/D6519 llvm-svn: 223380	2014-12-04 19:34:50 +00:00
Hal Finkel	aa19bafc9c	Revert "r223364 - Revert r223347 which has caused crashes on bootstrap bots." Reapply r223347, with a fix to not crash on uninserted instructions (or more precisely, instructions in uninserted blocks). bugpoint was able to reduce the test case somewhat, but it is still somewhat large (and relies on setting things up to be simplified during inlining), so I've not included it here. Nevertheless, it is clear what is going on and why. Original commit message: Restrict somewhat the memory-allocation pointer cmp opt from r223093 Based on review comments from Richard Smith, restrict this optimization from applying to globals that might resolve lazily to other dynamically-loaded modules, and also from dynamic allocas (which might be transformed into malloc calls). In short, take extra care that the compared-to pointer is really simultaneously live with the memory allocation. llvm-svn: 223371	2014-12-04 17:45:19 +00:00
Philip Reames	eafafa37bc	Fix a typo: use of cast where dyn_cast was intended This bug has the effect of converting a test of isGCRelocate(InvokeInst*) from a false return to a crash. This may be the root cause of the crash Joerg reported against r223137, but I'm still waiting for a clean build of clang to complete to be able to confirm. Once I've confirmed the issue, I'll submit a test case separately. llvm-svn: 223370	2014-12-04 17:27:58 +00:00
Rafael Espindola	c0610bf4e0	Remove dead code. NFC. This interface was added 2 years ago but users never developed. llvm-svn: 223368	2014-12-04 16:59:36 +00:00
Asiri Rathnayake	d33304b3ad	Fix a minor regression introduced in r223113 r223113 added support for ARM modified immediate assembly syntax. That patch has broken support for immediate expressions, as in: add r0, #(4 * 4) It wasn't caught because we don't have any tests for this feature. This patch fixes this regression and adds test cases. llvm-svn: 223366	2014-12-04 14:49:07 +00:00
Alexander Potapenko	76770e4930	Revert r223347 which has caused crashes on bootstrap bots. llvm-svn: 223364	2014-12-04 14:22:27 +00:00
Rafael Espindola	5403da4569	Revert "[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090 >" This reverts commit r223356. It was failing check-all (MC/ARM/thumb.s in particular). llvm-svn: 223363	2014-12-04 14:10:20 +00:00
Michael Kuperstein	0492bd2b9e	[X86] Improve a dag-combine that handles a vector extract -> zext sequence. The current DAG combine turns a sequence of extracts from <4 x i32> followed by zexts into a store followed by scalar loads. According to measurements by Martin Krastev (see PR 21269) for x86-64, a sequence of an extract, movs and shifts gives better performance. However, for 32-bit x86, the previous sequence still seems better. Differential Revision: http://reviews.llvm.org/D6501 llvm-svn: 223360	2014-12-04 13:49:51 +00:00
Jyoti Allur	b24d0abfe3	[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090 > llvm-svn: 223356	2014-12-04 11:52:49 +00:00
Andrea Di Biagio	61fac30180	[X86] Simplify code. NFC. Replaced some logic that checked if a build_vector node is doing a splat of a non-undef value with a call to method BuildVectorSDNode::getSplatValue(). No functional change intended. llvm-svn: 223354	2014-12-04 11:21:44 +00:00
Patrik Hagglund	d06de4b954	Use DomTree in MachineSink to sink over diamonds. According to a previous FIXME comment we now not only look at MBB successors, but also handle code sinking past them: x = computation if () {} else {} use x The instruction could be sunk over the whole diamond for the if/then/else (or loop, etc), allowing it to be sunk into other blocks after that. Modified test added in r204522, due to one spill less present. Minor fixes in comments. Patch provided by Jonas Paulsson. Reviewed by Hal Finkel. llvm-svn: 223350	2014-12-04 10:36:42 +00:00
Simon Pilgrim	be24ab367b	[InstCombine] Minor optimization for bswap with binary ops Added instcombine optimizations for BSWAP with AND/OR/XOR ops: OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) ) OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) ) Since its just a one liner, I've also added BSWAP to the DAGCombiner equivalent as well: fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y)) Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone. Differential Revision: http://reviews.llvm.org/D6407 llvm-svn: 223349	2014-12-04 09:44:01 +00:00
Elena Demikhovsky	f1de34b84d	Masked Load / Store Intrinsics - the CodeGen part. I'm recommitting the codegen part of the patch. The vectorizer part will be sent to review again. Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 223348	2014-12-04 09:40:44 +00:00
Hal Finkel	8b24b32c44	Restrict somewhat the memory-allocation pointer cmp opt from r223093 Based on review comments from Richard Smith, restrict this optimization from applying to globals that might resolve lazily to other dynamically-loaded modules, and also from dynamic allocas (which might be transformed into malloc calls). In short, take extra care that the compared-to pointer is really simultaneously live with the memory allocation. llvm-svn: 223347	2014-12-04 09:22:28 +00:00
Yaron Keren	56919ef104	clang-formatted ranged loops and assignment, NFC. llvm-svn: 223344	2014-12-04 08:30:39 +00:00
Jean-Daniel Dupas	00cc1f5cab	Add mach-o LC_RPATH support to llvm-objdump Summary: Add rpath load command support in Mach-O object and update llvm-objdump to use it. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6512 llvm-svn: 223343	2014-12-04 07:37:02 +00:00
Michael Liao	5bf9578ce4	[X86] Clean up whitespace as well as minor coding style llvm-svn: 223339	2014-12-04 05:20:33 +00:00
Colin LeMahieu	5d6f03bd5a	[Hexagon] Marking some instructions as CodeGenOnly=0 and adding disassembly tests. llvm-svn: 223334	2014-12-04 03:41:21 +00:00
Michael Liao	d8faa61b20	[X86] Restore X86 base pointer after call to llvm.eh.sjlj.setjmp Commit on - This patch fixes the bug described in http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-May/062343.html The fix allocates an extra slot just below the GPRs and stores the base pointer there. This is done only for functions containing llvm.eh.sjlj.setjmp that also need a base pointer. Because code containing llvm.eh.sjlj.setjmp saves all of the callee-save GPRs in the prologue, the offset to the extra slot can be computed before prologue generation runs. Impact at run-time on affected functions is: - One extra store in the prologue. The store saves the base pointer. - One extra load after a llvm.eh.sjlj.setjmp. The load restores the base pointer. Because the extra slot is just above a gap between frame-pointer-relative and base-pointer-relative chunks of memory, there is no impact on other offset calculations other than ensuring there is room for the extra slot. http://reviews.llvm.org/D6388 Patch by Arch Robison <arch.robison@intel.com> llvm-svn: 223329	2014-12-04 00:56:38 +00:00
Hal Finkel	029042b278	[PowerPC] 'cc' should be an alias only to 'cr0' We had mistakenly believed that GCC's 'cc' referred to the entire condition-code register (cr0 through cr7) -- and implemented this in r205630 to fix PR19326, but 'cc' is actually an alias only to 'cr0'. This is causing LLVM to clobber too much with legacy code with inline asm using the 'cc' clobber. Fixes PR21451. llvm-svn: 223328	2014-12-04 00:46:20 +00:00
NAKAMURA Takumi	597fbb5230	HexagonMCInst.h: Qualify constants explicitly to appease msc17. llvm-svn: 223325	2014-12-04 00:26:39 +00:00
Matt Arsenault	4e27343eec	Allow target to specify prefix for labels Use the MCAsmInfo instead of the DataLayout, and allow specifying a custom prefix for labels specifically. HSAIL requires that labels begin with @, but global symbols with &. llvm-svn: 223323	2014-12-04 00:06:57 +00:00
Philip Reames	a7eb3cb46e	A few more checks for gc.statepoints in the Verifier This is simply a grab bag of unrelated checks: - A statepoint call can't be marked readonly or readnone - We don't currently support inline asm or variadic target functions. Both could be supported, but don't currently work. - I forgot to check that the number of call arguments actually matched the wrapped callee in my previous change. Included here. llvm-svn: 223322	2014-12-04 00:01:48 +00:00
Hal Finkel	d433838adf	[PowerPC] Fix inline asm memory operands not to use r0 On PowerPC, inline asm memory operands might be expanded as 0($r), where $r is a register containing the address. As a result, this register cannot be r0, and we need to enforce this register subclass constraint to prevent miscompiling the code (we'd get this constraint for free with the usual instruction definitions, but that scheme has no knowledge of how we end up printing inline asm memory operands, and so here we need to do it 'by hand'). We can accomplish this within the current address-mode selection framework by introducing an explicit COPY_TO_REGCLASS node. Fixes PR21443. llvm-svn: 223318	2014-12-03 23:40:13 +00:00
Quentin Colombet	079aba733a	[RegAllocFast] Handle implicit definitions conservatively. Prior to this commit, physical registers defined implicitly were considered free right after their definition, i.e., like dead definitions. Therefore, their uses had to immediately follow their definitions, otherwise the related register may be reused to allocate a virtual register. This commit fixes this assumption by keeping implicit definitions alive until they are actually used. The downside is that if the implicit definition was dead (and not marked as such), we block an otherwise available register. This is however conservatively correct and makes the fast register allocator much more robust, in particular regarding the scheduling of the instructions. Fixes PR21700. llvm-svn: 223317	2014-12-03 23:38:08 +00:00
Kostya Serebryany	543f3db572	[msan] allow -fsanitize-coverage=N together with -fsanitize=memory, llvm part llvm-svn: 223312	2014-12-03 23:28:26 +00:00
Jacques Pienaar	0c7dc9f7c3	Test commit. llvm-svn: 223310	2014-12-03 23:21:02 +00:00
Rafael Espindola	31ad468d03	Split the set of identified struct types into opaque and non-opaque ones. The non-opaque part can be structurally uniqued. To keep this to just a hash lookup, we don't try to unique cyclic types. Also change the type mapping algorithm to be optimistic about a type not being recursive and only create a new type when proven to be wrong. This is not as strong as trying to speculate that we can keep the source type, but is simpler (no speculation to revert) and more powerful than what we had before (we don't copy non-recursive types at least). I initially wrote this to try to replace the name based type merging. It is not strong enough to replace it, but it is a useful addition. With this patch the number of named struct types in a clang lto bootstrap goes from 49674 to 15986. llvm-svn: 223278	2014-12-03 22:36:37 +00:00
Sanjay Patel	23b7ce2725	fix typos, grammar, formatting; NFC llvm-svn: 223276	2014-12-03 22:28:05 +00:00
Philip Reames	b23713aae9	Strengthen Verifier checks around the types involved in a statepoint Add checks that the types in a gc.statepoint sequence match the wrapper callee and that relocating a pointer doesn't change its type. llvm-svn: 223275	2014-12-03 22:23:24 +00:00
Matthias Braun	395a82f6cc	correct spelling, NFC llvm-svn: 223274	2014-12-03 22:10:39 +00:00
Matthias Braun	d34e4d2354	[SimplifyLibCalls] Improve double->float shrinking to consider constants This allows cases like float x; fmin(1.0, x); to be optimized to fminf(1.0f, x); rdar://19049359 Differential Revision: http://reviews.llvm.org/D6496 llvm-svn: 223270	2014-12-03 21:46:33 +00:00
Matthias Braun	892c923c46	[SimplifyLibCalls] Enable double to float shrinking for copysign rdar://19049359 Differential Revision: http://reviews.llvm.org/D6495 llvm-svn: 223269	2014-12-03 21:46:29 +00:00
Colin LeMahieu	654f2d2037	[Hexagon] Converting member InstrDesc to static variable. llvm-svn: 223268	2014-12-03 21:40:25 +00:00
Colin LeMahieu	7e9908ea10	[Hexagon] Converting subclass members to an implicit operand. llvm-svn: 223264	2014-12-03 20:23:22 +00:00
Philip Reames	38303a329f	Make the Verifier more strict about gc.statepoints The recently added documentation for statepoints claimed that we checked the parameters of the various intrinsics for validity. This patch adds the code to actually do so. I also removed a couple of redundant checks for conditions which are checked elsewhere in the Verifier and simplified the logic using the helper functions from Statepoint.h. llvm-svn: 223259	2014-12-03 19:53:15 +00:00
Will Schmidt	eba49233c3	Add TableGen info for Power8. This is based on the Power7 version, with units added and renamed to match P8. Differential Revision: http://reviews.llvm.org/D6358 llvm-svn: 223257	2014-12-03 18:46:30 +00:00
Roman Divacky	fdf0560997	Change the name to be in style. llvm-svn: 223255	2014-12-03 18:39:44 +00:00
Tom Stellard	05cd445c4d	R600/SI: Move SIInsertWaits into AMDGPUPassConfig::addPreSched2() This pass needs to be run after PrologEpilogInserter, because that pass may insert spill code which reads or writes memory. llvm-svn: 223253	2014-12-03 18:27:08 +00:00
Tom Stellard	92105e87e8	R600/SI: Don't run SI passes on R600 subtargets llvm-svn: 223252	2014-12-03 18:27:05 +00:00
Tim Northover	293d414380	AArch64: fix wrong-endian parameter passing. The blocked arguments code didn't take account of the hacks needed to support it. llvm-svn: 223247	2014-12-03 17:49:26 +00:00
Colin LeMahieu	089791db48	[NFC] Fixing pedantic warning extra semicolons. llvm-svn: 223246	2014-12-03 17:36:39 +00:00
Colin LeMahieu	1d04fa411f	[Hexagon] [NFC] Moving function implementations out of header. Clang-formatting files. llvm-svn: 223245	2014-12-03 17:35:39 +00:00
Colin LeMahieu	b4e5be4c66	[Hexagon] [NFC] Renaming packetStart to packetBegin llvm-svn: 223243	2014-12-03 17:31:43 +00:00
Aaron Ballman	d58a1f4d98	Silencing a 32-bit implicit conversion warning in MSVC; NFC. llvm-svn: 223237	2014-12-03 14:39:58 +00:00
Evgeniy Stepanov	2e5a1f1c9c	[msan] Add compile-time checks for missing origins. This change makes MemorySanitizer instrumentation a bit more strict about instructions that have no origin id assigned to them. This would have caught the bug that was fixed in r222918. This is re-commit of r222997, reverted in r223211, with 3 more missing origins added. llvm-svn: 223236	2014-12-03 14:15:53 +00:00
Erik Eckstein	d181752be0	InstCombine: simplify signed range checks Try to convert two compares of a signed range check into a single unsigned compare. Examples: (icmp sge x, 0) & (icmp slt x, n) --> icmp ult x, n (icmp slt x, 0) | (icmp sgt x, n) --> icmp ugt x, n llvm-svn: 223224	2014-12-03 10:39:15 +00:00
Hal Finkel	c91fc11181	[PowerPC] Print all inline-asm consts as signed numbers Almost all immediates in PowerPC assembly (both 32-bit and 64-bit) are signed numbers, and it is important that we print them as such. To make sure that happens, we change PPCTargetLowering::LowerAsmOperandForConstraint so that it does all intermediate checks on a signed-extended int64_t value, and then creates the resulting target constant using MVT::i64. This will ensure that all negative values are printed as negative values (mirroring what is done in other backends to achieve the same sign-extension effect). This came up in the context of inline assembly like this: "add%I2 %0,%0,%2", ..., "Ir"(-1ll) where we used to print: addi 3,3,4294967295 and gcc would print: addi 3,3,-1 and gas accepts both forms, but our builtin assembler (correctly) does not. Now we print -1 like gcc does. While here, I replaced a bunch of custom integer checks with isInt<16> and friends from MathExtras.h. Thanks to Paul Hargrove for the bug report. llvm-svn: 223220	2014-12-03 09:37:50 +00:00
Charlie Turner	f02c92489a	Emit ABI_FP_rounding attribute. LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When the user has specified this option, the Tag_ABI_FP_rounding attribute should be emitted with value 1. This option currently does not appear to disable transformations and optimizations that assume default floating point rounding behavior, AFAICT, but the intention should be recorded in the build attributes, regardless of what the compiler actually does with the intention. Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e llvm-svn: 223218	2014-12-03 08:12:26 +00:00
Rafael Espindola	2fa1e43a22	Ask the module for its identified types. When lazy reading a module, the types used in a function will not be visible to a TypeFinder until the body is read. This patch fixes that by asking the module for its identified struct types. If a materializer is present, the module asks it. If not, it uses a TypeFinder. This fixes pr21374. I will be the first to say that this is ugly, but it was the best I could find. Some of the options I looked at: * Asking the LLVMContext. This could be made to work for gold, but not currently for ld64. ld64 will load multiple modules into a single context before merging them. This causes us to see types from future merges. Unfortunately, MappedTypes is not just a cache when it comes to opaque types. Once the mapping has been made, we have to remember it for as long as the key may be used. This would mean moving MappedTypes to the Linker class and having to drop the Linker::LinkModules static methods, which are visible from C. * Adding an option to ignore function bodies in the TypeFinder. This would fix the PR by picking the worst result. It would work, but unfortunately we are currently quite dependent on the upfront type merging. I will try to reduce our dependency, but it is not clear that we will be able to get rid of it for now. The only clean solution I could think of is making the Module own the types. This would have other advantages, but it is a much bigger change. I will propose it, but it is nice to have this fixed while that is discussed. With the gold plugin, this patch takes the number of types in the LTO clang binary from 52817 to 49669. llvm-svn: 223215	2014-12-03 07:18:23 +00:00
Nick Lewycky	a4acb44995	Revert r222997. The newly added compile-time checks are finding missing origins, testcase is being reduced and a PR will be posted shortly. llvm-svn: 223211	2014-12-03 05:47:00 +00:00
Duncan P. N. Exon Smith	a48bd07e5e	LoopVectorize: Remove unnecessary RAUW Remove an unnecessary `MDNode::replaceAllUsesWith()`. In the preceding line, `TheLoop->setLoopID()` visits all backedges and sets the new loop ID. This sufficiently updates the loop metadata. Metadata RAUW is going away as part of PR21532. llvm-svn: 223210	2014-12-03 05:41:20 +00:00
Matt Arsenault	120a0c92f4	R600/SI: Fix SIFixSGPRCopies for copies to physical registers This shows up when operands required to be passed in VCC are copied to. llvm-svn: 223208	2014-12-03 05:22:39 +00:00
Matt Arsenault	88652a009b	R600/SI: Remove incorrect assertion This can be a COPY to a physical register, such as VCC llvm-svn: 223207	2014-12-03 05:22:38 +00:00
Matt Arsenault	becd656c7c	R600/SI: Remove i1 pseudo VALU ops Select i1 logical ops directly to 64-bit SALU instructions. Vector i1 values are always really in SGPRs, with each bit for each item in the wave. This saves about 4 instructions when and/or/xoring any condition, and also helps write conditions that need to be passed in vcc. This should work correctly now that the SGPR live range fixing pass works. More work is needed to eliminate the VReg_1 pseudo regclass and possibly the entire SILowerI1Copies pass. llvm-svn: 223206	2014-12-03 05:22:35 +00:00
Matt Arsenault	2f470c62cb	R600/SI: Fix suspicious indexing The loop is over the operands of an instruction, and checks the register with the sub reg index of the dest register. This probably meant to be checking the sub reg index of the same operand. llvm-svn: 223205	2014-12-03 05:22:32 +00:00
Matt Arsenault	691ae3d657	R600/SI: Fix running SILowerI1Copies a second time llvm-svn: 223204	2014-12-03 05:22:30 +00:00
Matt Arsenault	0d2832ae8d	R600/SI: Fix live range error hidden by SIFoldOperands m0 is treated as a virtual register class with a single register rather than the physical register it really is. This was updating the live range of the used virtual copy of m0 from the first ds_read instruction, and leaving the unused copy unchanged. This resulted in a "Live segment doesn't end at a valid instruction" verifier error because of the erased instructions. Update the live range of the second copy (which should be dead). No test since I'm not sure how to trigger this with SIFoldOperands enabled. llvm-svn: 223203	2014-12-03 05:22:29 +00:00
Tom Stellard	1f0dded057	StructurizeCFG: Use LoopInfo analysis for better loop detection We were assuming that each back-edge in a region represented a unique loop, which is not always the case. We need to use LoopInfo to correctly determine which back-edges are loops. llvm-svn: 223199	2014-12-03 04:28:32 +00:00
Duncan P. N. Exon Smith	c280ff0e47	NVPTX: Delete dead code `MDNode` does not inherit from `User`, and it never has a name. llvm-svn: 223198	2014-12-03 04:13:23 +00:00
Tom Stellard	369308061b	R600/SI: Enable inline assembly We just needed to remove the assertion in AMDGPURegisterInfo::getFrameRegister(), which is called when initializing the parser for inline assembly. llvm-svn: 223197	2014-12-03 04:08:00 +00:00
Matt Arsenault	fb13b22d9a	R600/SI: Change mubuf offsets to print as decimal This matches SC's behavior. llvm-svn: 223194	2014-12-03 03:12:13 +00:00
Nick Lewycky	2e8a6219fc	Emit the entry block first and the exit block second, then all the blocks in between afterwards. This is what gcc always does, and some out of tree tools depend on that. llvm-svn: 223193	2014-12-03 02:45:01 +00:00
Peter Collingbourne	51d2de7b9e	Prologue support Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are three primary use cases that these attributes aim to serve: 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References ---------- This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189	2014-12-03 02:08:38 +00:00
Ahmed Bougacha	d65f787a5f	[X86][MC] Intel syntax: accept implicit memory operand sizes larger than 80. The X86AsmParser intel handling was refactored in r216481, making it try each different memory operand size to see which one matches. Operand sizes larger than 80 ("[xyz]mmword ptr") were forgotten, which led to an "invalid operand" error for code such as: movdqa [rax], xmm0 llvm-svn: 223187	2014-12-03 02:03:26 +00:00
Lang Hames	4a5697e659	[MCJIT] Unique-ptrify the RTDyldMemoryManager member of MCJIT. NFC. llvm-svn: 223183	2014-12-03 00:51:19 +00:00
Hal Finkel	01fa7701e6	[PowerPC] Fix readcyclecounter to be custom expanded for all 32-bit targets We need to use the custom expansion of readcyclecounter on all 32-bit targets (even those with 64-bit registers). This should fix the ppc64 buildbot. llvm-svn: 223182	2014-12-03 00:19:17 +00:00
Tim Northover	4a8ac260cc	AArch64: strengthen Darwin ABI alignment assumptions A global variable without an explicit alignment specified should be assumed to be ABI-aligned according to its type, like on other platforms. This allows us to use better memory operations when accessing it. rdar://18533701 llvm-svn: 223180	2014-12-02 23:53:43 +00:00

75437 Commits