Phoebe Wang
655ba9c8a1
Reland "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""
...
This resolves problems reported in commit 1a20252978 .
1. Promote to float lowering for nodes XINT_TO_FP
2. Bail out f16 from shuffle combine due to vector type is not legal in the version
2022-06-17 21:34:05 +08:00
Benjamin Kramer
1a20252978
Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""
...
This reverts commit 04a3d5f3a1 .
I see two more issues:
- uitofp/sitofp from i32/i64 to half now generates
__floatsihf/__floatdihf, which exists in neither compiler-rt nor
libgcc
- This crashes when legalizing the bitcast:
```
; RUN: llc < %s -mcpu=skx
define void @main.45(ptr nocapture readnone %retval, ptr noalias nocapture readnone %run_options, ptr noalias nocapture readnone %params, ptr noalias nocapture readonly %buffer_table, ptr noalias nocapture readnone %status, ptr noalias nocapture readnone %prof_counters) local_unnamed_addr {
entry:
%fusion = load ptr, ptr %buffer_table, align 8
%0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
%Arg_1.2 = load ptr, ptr %0, align 8
%1 = getelementptr inbounds ptr, ptr %buffer_table, i64 2
%Arg_0.1 = load ptr, ptr %1, align 8
%2 = load half, ptr %Arg_0.1, align 8
%3 = bitcast half %2 to i16
%4 = and i16 %3, 32767
%5 = icmp eq i16 %4, 0
%6 = and i16 %3, -32768
%broadcast.splatinsert = insertelement <4 x half> poison, half %2, i64 0
%broadcast.splat = shufflevector <4 x half> %broadcast.splatinsert, <4 x half> poison, <4 x i32> zeroinitializer
%broadcast.splatinsert9 = insertelement <4 x i16> poison, i16 %4, i64 0
%broadcast.splat10 = shufflevector <4 x i16> %broadcast.splatinsert9, <4 x i16> poison, <4 x i32> zeroinitializer
%broadcast.splatinsert11 = insertelement <4 x i16> poison, i16 %6, i64 0
%broadcast.splat12 = shufflevector <4 x i16> %broadcast.splatinsert11, <4 x i16> poison, <4 x i32> zeroinitializer
%broadcast.splatinsert13 = insertelement <4 x i16> poison, i16 %3, i64 0
%broadcast.splat14 = shufflevector <4 x i16> %broadcast.splatinsert13, <4 x i16> poison, <4 x i32> zeroinitializer
%wide.load = load <4 x half>, ptr %Arg_1.2, align 8
%7 = fcmp uno <4 x half> %broadcast.splat, %wide.load
%8 = fcmp oeq <4 x half> %broadcast.splat, %wide.load
%9 = bitcast <4 x half> %wide.load to <4 x i16>
%10 = and <4 x i16> %9, <i16 32767, i16 32767, i16 32767, i16 32767>
%11 = icmp eq <4 x i16> %10, zeroinitializer
%12 = and <4 x i16> %9, <i16 -32768, i16 -32768, i16 -32768, i16 -32768>
%13 = or <4 x i16> %12, <i16 1, i16 1, i16 1, i16 1>
%14 = select <4 x i1> %11, <4 x i16> %9, <4 x i16> %13
%15 = icmp ugt <4 x i16> %broadcast.splat10, %10
%16 = icmp ne <4 x i16> %broadcast.splat12, %12
%17 = or <4 x i1> %15, %16
%18 = select <4 x i1> %17, <4 x i16> <i16 -1, i16 -1, i16 -1, i16 -1>, <4 x i16> <i16 1, i16 1, i16 1, i16 1>
%19 = add <4 x i16> %18, %broadcast.splat14
%20 = select i1 %5, <4 x i16> %14, <4 x i16> %19
%21 = select <4 x i1> %8, <4 x i16> %9, <4 x i16> %20
%22 = bitcast <4 x i16> %21 to <4 x half>
%23 = select <4 x i1> %7, <4 x half> <half 0xH7E00, half 0xH7E00, half 0xH7E00, half 0xH7E00>, <4 x half> %22
store <4 x half> %23, ptr %fusion, align 16
ret void
}
```
llc: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:977: void (anonymous namespace)::SelectionDAGLegalize::LegalizeOp(llvm::SDNode *): Assertion `(TLI.getTypeAction(*DAG.getContext(), Op.getValueType()) == TargetLowering::TypeLegal || Op.getOpcode() == ISD::TargetConstant || Op.getOpcode() == ISD::Register) && "Unexpected illegal type!"' failed.
2022-06-17 09:43:07 +02:00
Phoebe Wang
04a3d5f3a1
Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""
...
Fix the crash on lowering X86ISD::FCMP.
2022-06-17 12:12:17 +08:00
Paul Robinson
ff0122dcce
[PS5] Emit ud2 for ubsan trap
2022-06-16 11:20:10 -07:00
Paul Robinson
77b00098f2
[PS5] Use same debug trap instruction as PS4
2022-06-16 11:03:03 -07:00
Frederik Gossen
3cd5696a33
Revert "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""
...
This reverts commit e1c5afa47d .
This introduces crashes in the JAX backend on CPU. A reproducer in LLVM is
below. Let me know if you have trouble reproducing this.
; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"
@0 = private unnamed_addr constant [4 x i8] c"\00\00\00?"
@1 = private unnamed_addr constant [4 x i8] c"\1C}\908"
@2 = private unnamed_addr constant [4 x i8] c"?\00\\4"
@3 = private unnamed_addr constant [4 x i8] c"%ci1"
@4 = private unnamed_addr constant [4 x i8] zeroinitializer
@5 = private unnamed_addr constant [4 x i8] c"\00\00\00\C0"
@6 = private unnamed_addr constant [4 x i8] c"\00\00\00B"
@7 = private unnamed_addr constant [4 x i8] c"\94\B4\C22"
@8 = private unnamed_addr constant [4 x i8] c"^\09B6"
@9 = private unnamed_addr constant [4 x i8] c"\15\F3M?"
@10 = private unnamed_addr constant [4 x i8] c"e\CC\\;"
@11 = private unnamed_addr constant [4 x i8] c"d\BD/>"
@12 = private unnamed_addr constant [4 x i8] c"V\F4I="
@13 = private unnamed_addr constant [4 x i8] c"\10\CB,<"
@14 = private unnamed_addr constant [4 x i8] c"\AC\E3\D6:"
@15 = private unnamed_addr constant [4 x i8] c"\DC\A8E9"
@16 = private unnamed_addr constant [4 x i8] c"\C6\FA\897"
@17 = private unnamed_addr constant [4 x i8] c"%\F9\955"
@18 = private unnamed_addr constant [4 x i8] c"\B5\DB\813"
@19 = private unnamed_addr constant [4 x i8] c"\B4W_\B2"
@20 = private unnamed_addr constant [4 x i8] c"\1Cc\8F\B4"
@21 = private unnamed_addr constant [4 x i8] c"~3\94\B6"
@22 = private unnamed_addr constant [4 x i8] c"3Yq\B8"
@23 = private unnamed_addr constant [4 x i8] c"\E9\17\17\BA"
@24 = private unnamed_addr constant [4 x i8] c"\F1\B2\8D\BB"
@25 = private unnamed_addr constant [4 x i8] c"\F8t\C2\BC"
@26 = private unnamed_addr constant [4 x i8] c"\82[\C2\BD"
@27 = private unnamed_addr constant [4 x i8] c"uB-?"
@28 = private unnamed_addr constant [4 x i8] c"^\FF\9B\BE"
@29 = private unnamed_addr constant [4 x i8] c"\00\00\00A"
; Function Attrs: uwtable
define void @main.158(ptr %retval, ptr noalias %run_options, ptr noalias %params, ptr noalias %buffer_table, ptr noalias %status, ptr noalias %prof_counters) #0 {
entry:
%fusion.invar_address.dim.1 = alloca i64, align 8
%fusion.invar_address.dim.0 = alloca i64, align 8
%0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
%Arg_0.1 = load ptr, ptr %0, align 8, !invariant.load !0 , !dereferenceable !1 , !align !2
%1 = getelementptr inbounds ptr, ptr %buffer_table, i64 0
%fusion = load ptr, ptr %1, align 8, !invariant.load !0 , !dereferenceable !1 , !align !2
store i64 0, ptr %fusion.invar_address.dim.0, align 8
br label %fusion.loop_header.dim.0
return: ; preds = %fusion.loop_exit.dim.0
ret void
fusion.loop_header.dim.0: ; preds = %fusion.loop_exit.dim.1, %entry
%fusion.indvar.dim.0 = load i64, ptr %fusion.invar_address.dim.0, align 8
%2 = icmp uge i64 %fusion.indvar.dim.0, 3
br i1 %2, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0
fusion.loop_body.dim.0: ; preds = %fusion.loop_header.dim.0
store i64 0, ptr %fusion.invar_address.dim.1, align 8
br label %fusion.loop_header.dim.1
fusion.loop_header.dim.1: ; preds = %fusion.loop_body.dim.1, %fusion.loop_body.dim.0
%fusion.indvar.dim.1 = load i64, ptr %fusion.invar_address.dim.1, align 8
%3 = icmp uge i64 %fusion.indvar.dim.1, 1
br i1 %3, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1
fusion.loop_body.dim.1: ; preds = %fusion.loop_header.dim.1
%4 = getelementptr inbounds [3 x [1 x half]], ptr %Arg_0.1, i64 0, i64 %fusion.indvar.dim.0, i64 0
%5 = load half, ptr %4, align 2, !invariant.load !0 , !noalias !3
%6 = fpext half %5 to float
%7 = call float @llvm.fabs.f32(float %6)
%constant.121 = load float, ptr @29, align 4
%compare.2 = fcmp ole float %7, %constant.121
%8 = zext i1 %compare.2 to i8
%constant.120 = load float, ptr @0, align 4
%multiply.95 = fmul float %7, %constant.120
%constant.119 = load float, ptr @5, align 4
%add.82 = fadd float %multiply.95, %constant.119
%constant.118 = load float, ptr @4, align 4
%multiply.94 = fmul float %add.82, %constant.118
%constant.117 = load float, ptr @19, align 4
%add.81 = fadd float %multiply.94, %constant.117
%multiply.92 = fmul float %add.82, %add.81
%constant.116 = load float, ptr @18, align 4
%add.79 = fadd float %multiply.92, %constant.116
%multiply.91 = fmul float %add.82, %add.79
%subtract.87 = fsub float %multiply.91, %add.81
%constant.115 = load float, ptr @20, align 4
%add.78 = fadd float %subtract.87, %constant.115
%multiply.89 = fmul float %add.82, %add.78
%subtract.86 = fsub float %multiply.89, %add.79
%constant.114 = load float, ptr @17, align 4
%add.76 = fadd float %subtract.86, %constant.114
%multiply.88 = fmul float %add.82, %add.76
%subtract.84 = fsub float %multiply.88, %add.78
%constant.113 = load float, ptr @21, align 4
%add.75 = fadd float %subtract.84, %constant.113
%multiply.86 = fmul float %add.82, %add.75
%subtract.83 = fsub float %multiply.86, %add.76
%constant.112 = load float, ptr @16, align 4
%add.73 = fadd float %subtract.83, %constant.112
%multiply.85 = fmul float %add.82, %add.73
%subtract.81 = fsub float %multiply.85, %add.75
%constant.111 = load float, ptr @22, align 4
%add.72 = fadd float %subtract.81, %constant.111
%multiply.83 = fmul float %add.82, %add.72
%subtract.80 = fsub float %multiply.83, %add.73
%constant.110 = load float, ptr @15, align 4
%add.70 = fadd float %subtract.80, %constant.110
%multiply.82 = fmul float %add.82, %add.70
%subtract.78 = fsub float %multiply.82, %add.72
%constant.109 = load float, ptr @23, align 4
%add.69 = fadd float %subtract.78, %constant.109
%multiply.80 = fmul float %add.82, %add.69
%subtract.77 = fsub float %multiply.80, %add.70
%constant.108 = load float, ptr @14, align 4
%add.68 = fadd float %subtract.77, %constant.108
%multiply.79 = fmul float %add.82, %add.68
%subtract.75 = fsub float %multiply.79, %add.69
%constant.107 = load float, ptr @24, align 4
%add.67 = fadd float %subtract.75, %constant.107
%multiply.77 = fmul float %add.82, %add.67
%subtract.74 = fsub float %multiply.77, %add.68
%constant.106 = load float, ptr @13, align 4
%add.66 = fadd float %subtract.74, %constant.106
%multiply.76 = fmul float %add.82, %add.66
%subtract.72 = fsub float %multiply.76, %add.67
%constant.105 = load float, ptr @25, align 4
%add.65 = fadd float %subtract.72, %constant.105
%multiply.74 = fmul float %add.82, %add.65
%subtract.71 = fsub float %multiply.74, %add.66
%constant.104 = load float, ptr @12, align 4
%add.64 = fadd float %subtract.71, %constant.104
%multiply.73 = fmul float %add.82, %add.64
%subtract.69 = fsub float %multiply.73, %add.65
%constant.103 = load float, ptr @26, align 4
%add.63 = fadd float %subtract.69, %constant.103
%multiply.71 = fmul float %add.82, %add.63
%subtract.67 = fsub float %multiply.71, %add.64
%constant.102 = load float, ptr @11, align 4
%add.62 = fadd float %subtract.67, %constant.102
%multiply.70 = fmul float %add.82, %add.62
%subtract.66 = fsub float %multiply.70, %add.63
%constant.101 = load float, ptr @28, align 4
%add.61 = fadd float %subtract.66, %constant.101
%multiply.68 = fmul float %add.82, %add.61
%subtract.65 = fsub float %multiply.68, %add.62
%constant.100 = load float, ptr @27, align 4
%add.60 = fadd float %subtract.65, %constant.100
%subtract.64 = fsub float %add.60, %add.62
%multiply.66 = fmul float %subtract.64, %constant.120
%constant.99 = load float, ptr @6, align 4
%divide.4 = fdiv float %constant.99, %7
%add.59 = fadd float %divide.4, %constant.119
%multiply.65 = fmul float %add.59, %constant.118
%constant.98 = load float, ptr @3, align 4
%add.58 = fadd float %multiply.65, %constant.98
%multiply.64 = fmul float %add.59, %add.58
%constant.97 = load float, ptr @7, align 4
%add.57 = fadd float %multiply.64, %constant.97
%multiply.63 = fmul float %add.59, %add.57
%subtract.63 = fsub float %multiply.63, %add.58
%constant.96 = load float, ptr @2, align 4
%add.56 = fadd float %subtract.63, %constant.96
%multiply.62 = fmul float %add.59, %add.56
%subtract.62 = fsub float %multiply.62, %add.57
%constant.95 = load float, ptr @8, align 4
%add.55 = fadd float %subtract.62, %constant.95
%multiply.61 = fmul float %add.59, %add.55
%subtract.61 = fsub float %multiply.61, %add.56
%constant.94 = load float, ptr @1, align 4
%add.54 = fadd float %subtract.61, %constant.94
%multiply.60 = fmul float %add.59, %add.54
%subtract.60 = fsub float %multiply.60, %add.55
%constant.93 = load float, ptr @10, align 4
%add.53 = fadd float %subtract.60, %constant.93
%multiply.59 = fmul float %add.59, %add.53
%subtract.59 = fsub float %multiply.59, %add.54
%constant.92 = load float, ptr @9, align 4
%add.52 = fadd float %subtract.59, %constant.92
%subtract.58 = fsub float %add.52, %add.54
%multiply.58 = fmul float %subtract.58, %constant.120
%9 = call float @llvm.sqrt.f32(float %7)
%10 = fdiv float 1.000000e+00, %9
%multiply.57 = fmul float %multiply.58, %10
%11 = trunc i8 %8 to i1
%12 = select i1 %11, float %multiply.66, float %multiply.57
%13 = fptrunc float %12 to half
%14 = getelementptr inbounds [3 x [1 x half]], ptr %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 0
store half %13, ptr %14, align 2, !alias.scope !3
%invar.inc1 = add nuw nsw i64 %fusion.indvar.dim.1, 1
store i64 %invar.inc1, ptr %fusion.invar_address.dim.1, align 8
br label %fusion.loop_header.dim.1
fusion.loop_exit.dim.1: ; preds = %fusion.loop_header.dim.1
%invar.inc = add nuw nsw i64 %fusion.indvar.dim.0, 1
store i64 %invar.inc, ptr %fusion.invar_address.dim.0, align 8
br label %fusion.loop_header.dim.0
fusion.loop_exit.dim.0: ; preds = %fusion.loop_header.dim.0
br label %return
}
; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.fabs.f32(float %0) #1
; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.sqrt.f32(float %0) #1
attributes #0 = { uwtable "denormal-fp-math"="preserve-sign" "no-frame-pointer-elim"="false" }
attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
!0 = !{}
!1 = !{i64 6}
!2 = !{i64 8}
!3 = !{!4}
!4 = !{!"buffer: {index:0, offset:0, size:6}", !5}
!5 = !{!"XLA global AA domain"}
2022-06-15 18:04:42 -04:00
Simon Pilgrim
4204361fed
[X86] X86InstrInfo.cpp - fix signed/unsigned promotion warnings in addImm calls
...
addImm takes a int64_t arg but we were using uint64_t types
2022-06-15 18:21:43 +01:00
Paul Robinson
654a835c3f
[PS5] Trap after noreturn calls, with special case for stack-check-fail
2022-06-15 09:02:17 -07:00
Phoebe Wang
e1c5afa47d
Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""
...
Fixed the missing SQRT promotion. Adding several missing operations too.
2022-06-15 23:00:18 +08:00
Thomas Joerg
37455b1f71
Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""
...
This reverts commit 6e02e27536 .
This introduces a crash in the backend. Reproducer in MLIR's LLVM
dialect follows. Let me know if you have trouble reproducing this.
module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @_mlir_ciface_tf_report_error(!llvm.ptr<i8>, i32, !llvm.ptr<i8>)
llvm.mlir.global internal constant @error_message_2208944672953921889("failed to allocate memory at loc(\22-\22:3:8)\00")
llvm.func @_mlir_ciface_tf_alloc(!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
llvm.func @Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<i8>, %arg1: i64, %arg2: !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> attributes {llvm.emit_c_interface, tf_entry} {
%0 = llvm.mlir.constant(8 : i32) : i32
%1 = llvm.mlir.constant(8 : index) : i64
%2 = llvm.mlir.constant(2 : index) : i64
%3 = llvm.mlir.constant(dense<0.000000e+00> : vector<4xf16>) : vector<4xf16>
%4 = llvm.mlir.constant(dense<[0, 1, 2, 3]> : vector<4xi32>) : vector<4xi32>
%5 = llvm.mlir.constant(dense<1.000000e+00> : vector<4xf16>) : vector<4xf16>
%6 = llvm.mlir.constant(false) : i1
%7 = llvm.mlir.constant(1 : i32) : i32
%8 = llvm.mlir.constant(0 : i32) : i32
%9 = llvm.mlir.constant(4 : index) : i64
%10 = llvm.mlir.constant(0 : index) : i64
%11 = llvm.mlir.constant(1 : index) : i64
%12 = llvm.mlir.constant(-1 : index) : i64
%13 = llvm.mlir.null : !llvm.ptr<f16>
%14 = llvm.getelementptr %13[%9] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%15 = llvm.ptrtoint %14 : !llvm.ptr<f16> to i64
%16 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
%17 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
%18 = llvm.mlir.null : !llvm.ptr<i64>
%19 = llvm.getelementptr %18[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%20 = llvm.ptrtoint %19 : !llvm.ptr<i64> to i64
%21 = llvm.alloca %20 x i64 : (i64) -> !llvm.ptr<i64>
llvm.br ^bb1(%10 : i64)
^bb1(%22: i64): // 2 preds: ^bb0, ^bb2
%23 = llvm.icmp "slt" %22, %arg1 : i64
llvm.cond_br %23, ^bb2, ^bb3
^bb2: // pred: ^bb1
%24 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
%25 = llvm.getelementptr %24[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
%26 = llvm.add %22, %11 : i64
%27 = llvm.getelementptr %25[%26] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%28 = llvm.load %27 : !llvm.ptr<i64>
%29 = llvm.getelementptr %21[%22] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %28, %29 : !llvm.ptr<i64>
llvm.br ^bb1(%26 : i64)
^bb3: // pred: ^bb1
llvm.br ^bb4(%10, %11 : i64, i64)
^bb4(%30: i64, %31: i64): // 2 preds: ^bb3, ^bb5
%32 = llvm.icmp "slt" %30, %arg1 : i64
llvm.cond_br %32, ^bb5, ^bb6
^bb5: // pred: ^bb4
%33 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
%34 = llvm.getelementptr %33[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
%35 = llvm.add %30, %11 : i64
%36 = llvm.getelementptr %34[%35] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%37 = llvm.load %36 : !llvm.ptr<i64>
%38 = llvm.mul %37, %31 : i64
llvm.br ^bb4(%35, %38 : i64, i64)
^bb6: // pred: ^bb4
%39 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
%40 = llvm.getelementptr %39[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%41 = llvm.load %40 : !llvm.ptr<ptr<f16>>
%42 = llvm.getelementptr %13[%11] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%43 = llvm.ptrtoint %42 : !llvm.ptr<f16> to i64
%44 = llvm.alloca %7 x i32 : (i32) -> !llvm.ptr<i32>
llvm.store %8, %44 : !llvm.ptr<i32>
%45 = llvm.call @_mlir_ciface_tf_alloc(%arg0, %31, %43, %8, %7, %44) : (!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
%46 = llvm.bitcast %45 : !llvm.ptr<i8> to !llvm.ptr<f16>
%47 = llvm.icmp "eq" %31, %10 : i64
%48 = llvm.or %6, %47 : i1
%49 = llvm.mlir.null : !llvm.ptr<i8>
%50 = llvm.icmp "ne" %45, %49 : !llvm.ptr<i8>
%51 = llvm.or %50, %48 : i1
llvm.cond_br %51, ^bb7, ^bb13
^bb7: // pred: ^bb6
%52 = llvm.urem %31, %9 : i64
%53 = llvm.sub %31, %52 : i64
llvm.br ^bb8(%10 : i64)
^bb8(%54: i64): // 2 preds: ^bb7, ^bb9
%55 = llvm.icmp "slt" %54, %53 : i64
llvm.cond_br %55, ^bb9, ^bb10
^bb9: // pred: ^bb8
%56 = llvm.mul %54, %11 : i64
%57 = llvm.add %56, %10 : i64
%58 = llvm.add %57, %10 : i64
%59 = llvm.getelementptr %41[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%60 = llvm.bitcast %59 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
%61 = llvm.load %60 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%62 = "llvm.intr.sqrt"(%61) : (vector<4xf16>) -> vector<4xf16>
%63 = llvm.fdiv %5, %62 : vector<4xf16>
%64 = llvm.getelementptr %46[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%65 = llvm.bitcast %64 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %63, %65 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%66 = llvm.add %54, %9 : i64
llvm.br ^bb8(%66 : i64)
^bb10: // pred: ^bb8
%67 = llvm.icmp "ult" %53, %31 : i64
llvm.cond_br %67, ^bb11, ^bb12
^bb11: // pred: ^bb10
%68 = llvm.mul %53, %12 : i64
%69 = llvm.add %31, %68 : i64
%70 = llvm.mul %53, %11 : i64
%71 = llvm.add %70, %10 : i64
%72 = llvm.trunc %69 : i64 to i32
%73 = llvm.mlir.undef : vector<4xi32>
%74 = llvm.insertelement %72, %73[%8 : i32] : vector<4xi32>
%75 = llvm.shufflevector %74, %73 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<4xi32>, vector<4xi32>
%76 = llvm.icmp "slt" %4, %75 : vector<4xi32>
%77 = llvm.add %71, %10 : i64
%78 = llvm.getelementptr %41[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%79 = llvm.bitcast %78 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
%80 = llvm.intr.masked.load %79, %76, %3 {alignment = 2 : i32} : (!llvm.ptr<vector<4xf16>>, vector<4xi1>, vector<4xf16>) -> vector<4xf16>
%81 = llvm.bitcast %16 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %80, %81 : !llvm.ptr<vector<4xf16>>
%82 = llvm.load %81 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%83 = "llvm.intr.sqrt"(%82) : (vector<4xf16>) -> vector<4xf16>
%84 = llvm.fdiv %5, %83 : vector<4xf16>
%85 = llvm.bitcast %17 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %84, %85 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%86 = llvm.load %85 : !llvm.ptr<vector<4xf16>>
%87 = llvm.getelementptr %46[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%88 = llvm.bitcast %87 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.intr.masked.store %86, %88, %76 {alignment = 2 : i32} : vector<4xf16>, vector<4xi1> into !llvm.ptr<vector<4xf16>>
llvm.br ^bb12
^bb12: // 2 preds: ^bb10, ^bb11
%89 = llvm.mul %2, %1 : i64
%90 = llvm.mul %arg1, %2 : i64
%91 = llvm.add %90, %11 : i64
%92 = llvm.mul %91, %1 : i64
%93 = llvm.add %89, %92 : i64
%94 = llvm.alloca %93 x i8 : (i64) -> !llvm.ptr<i8>
%95 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
llvm.store %46, %95 : !llvm.ptr<ptr<f16>>
%96 = llvm.getelementptr %95[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
llvm.store %46, %96 : !llvm.ptr<ptr<f16>>
%97 = llvm.getelementptr %95[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%98 = llvm.bitcast %97 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
llvm.store %10, %98 : !llvm.ptr<i64>
%99 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>
%100 = llvm.getelementptr %99[%10, 3] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>, i64) -> !llvm.ptr<i64>
%101 = llvm.getelementptr %100[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%102 = llvm.sub %arg1, %11 : i64
llvm.br ^bb14(%102, %11 : i64, i64)
^bb13: // pred: ^bb6
%103 = llvm.mlir.addressof @error_message_2208944672953921889 : !llvm.ptr<array<42 x i8>>
%104 = llvm.getelementptr %103[%10, %10] : (!llvm.ptr<array<42 x i8>>, i64, i64) -> !llvm.ptr<i8>
llvm.call @_mlir_ciface_tf_report_error(%arg0, %0, %104) : (!llvm.ptr<i8>, i32, !llvm.ptr<i8>) -> ()
%105 = llvm.mul %2, %1 : i64
%106 = llvm.mul %2, %10 : i64
%107 = llvm.add %106, %11 : i64
%108 = llvm.mul %107, %1 : i64
%109 = llvm.add %105, %108 : i64
%110 = llvm.alloca %109 x i8 : (i64) -> !llvm.ptr<i8>
%111 = llvm.bitcast %110 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
llvm.store %13, %111 : !llvm.ptr<ptr<f16>>
%112 = llvm.getelementptr %111[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
llvm.store %13, %112 : !llvm.ptr<ptr<f16>>
%113 = llvm.getelementptr %111[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%114 = llvm.bitcast %113 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
llvm.store %10, %114 : !llvm.ptr<i64>
%115 = llvm.call @malloc(%109) : (i64) -> !llvm.ptr<i8>
"llvm.intr.memcpy"(%115, %110, %109, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
%116 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
%117 = llvm.insertvalue %10, %116[0] : !llvm.struct<(i64, ptr<i8>)>
%118 = llvm.insertvalue %115, %117[1] : !llvm.struct<(i64, ptr<i8>)>
llvm.return %118 : !llvm.struct<(i64, ptr<i8>)>
^bb14(%119: i64, %120: i64): // 2 preds: ^bb12, ^bb15
%121 = llvm.icmp "sge" %119, %10 : i64
llvm.cond_br %121, ^bb15, ^bb16
^bb15: // pred: ^bb14
%122 = llvm.getelementptr %21[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%123 = llvm.load %122 : !llvm.ptr<i64>
%124 = llvm.getelementptr %100[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %123, %124 : !llvm.ptr<i64>
%125 = llvm.getelementptr %101[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %120, %125 : !llvm.ptr<i64>
%126 = llvm.mul %120, %123 : i64
%127 = llvm.sub %119, %11 : i64
llvm.br ^bb14(%127, %126 : i64, i64)
^bb16: // pred: ^bb14
%128 = llvm.call @malloc(%93) : (i64) -> !llvm.ptr<i8>
"llvm.intr.memcpy"(%128, %94, %93, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
%129 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
%130 = llvm.insertvalue %arg1, %129[0] : !llvm.struct<(i64, ptr<i8>)>
%131 = llvm.insertvalue %128, %130[1] : !llvm.struct<(i64, ptr<i8>)>
llvm.return %131 : !llvm.struct<(i64, ptr<i8>)>
}
llvm.func @_mlir_ciface_Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<struct<(i64, ptr<i8>)>>, %arg1: !llvm.ptr<i8>, %arg2: !llvm.ptr<struct<(i64, ptr<i8>)>>) attributes {llvm.emit_c_interface, tf_entry} {
%0 = llvm.load %arg2 : !llvm.ptr<struct<(i64, ptr<i8>)>>
%1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
%2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>
%3 = llvm.call @Rsqrt_CPU_DT_HALF_DT_HALF(%arg1, %1, %2) : (!llvm.ptr<i8>, i64, !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)>
llvm.store %3, %arg0 : !llvm.ptr<struct<(i64, ptr<i8>)>>
llvm.return
}
}
2022-06-15 13:24:24 +02:00
Simon Pilgrim
cf2072bcad
[X86] X86TargetTransformInfo.cpp - use InstructionCost type to accumulate instructions costs
2022-06-15 12:21:01 +01:00
Benjamin Kramer
fb34d531af
Promote bf16 to f32 when the target doesn't support it
...
This is modeled after the half-precision fp support. Two new nodes are
introduced for casting from and to bf16. Since casting from bf16 is a
simple operation I opted to always directly lower it to integer
arithmetic. The other way round is more complicated if you want to
preserve IEEE semantics, so it's handled by a new __truncsfbf2
compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened
fadd on x86.
Possible future improvements:
- Targets with bf16 conversion instructions can now make fp_to_bf16 legal
- The software conversion to bf16 can be replaced by a trivial
implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
2022-06-15 12:56:31 +02:00
Simon Pilgrim
4fd561415e
[X86] needCarryOrOverflowFlag/onlyZeroFlagUsed - merge identical switch cases. NFCI.
...
Makes it easier to grok and fixes various bugprone-branch-clone warnings.
2022-06-15 10:40:22 +01:00
Amir Ayupov
5965878d4d
[X86][NFC] Use mnemonic tables in validateInstruction 4/4
...
Group switch cases by opcode:
- VGATHERDPD
- VGATHERDPS
- VGATHERQPD
- VGATHERQPS
- VPGATHERDD
- VPGATHERDQ
- VPGATHERQD
- VPGATHERQQ
Distinguish masked vs non-masked forms by EVEX encoding.
Reviewed By: skan, craig.topper
Differential Revision: https://reviews.llvm.org/D127719
2022-06-14 19:53:44 -07:00
Luo, Yuanke
54ec8e25fc
[X86][AMX] Fix klockwork issue.
2022-06-15 09:26:59 +08:00
Phoebe Wang
6e02e27536
Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"
...
Disabled 2 mlir tests due to the runtime doesn't support `_Float16`, see
the issue here https://github.com/llvm/llvm-project/issues/55992
2022-06-15 09:15:31 +08:00
Amir Ayupov
6226e46c5f
[X86][NFC] Use mnemonic tables in validateInstruction 3/4
...
Group switch cases by opcode:
- V4FMADDPS
- V4FMADDSS
- V4FNMADDPS
- V4FNMADDSS
- VP4DPWSSDS
- VP4DPWSSD
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D127718
2022-06-14 12:11:47 -07:00
Amir Ayupov
df16c077dc
[X86][NFC] Use mnemonic tables in validateInstruction 2/4
...
Group switch cases by opcode:
- VFCMULCPH
- VFCMULCSH
- VFMULCPH
- VFMULCSH
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D127717
2022-06-14 12:09:37 -07:00
Amir Ayupov
4bf928bce4
[X86][NFC] Use mnemonic tables in validateInstruction 1/4
...
Group switch cases by opcode:
- VFCMADDCPH
- VFCMADDCSH
- VFMADDCPH
- VFMADDCSH
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D127716
2022-06-14 12:06:23 -07:00
Simon Pilgrim
64eea34420
[X86] combineEXTEND_VECTOR_INREG - don't attempt to shuffle combine ANY_EXTEND_VECTOR_INREG without SSE41
...
Without SSE41, ANY_EXTEND_VECTOR_INREG nodes are likely to be prematurely combined to a target shuffle preventing generic sign extension folds.
Fixes a number of sign-extend regressions in D127115.
2022-06-13 17:42:04 +01:00
Maksim Panchenko
8f6512fea0
[X86][Disassembler] Fix displacement operand size for symbolizer
...
On 64-bit X86, 0x66 operand-size override prefix will change the size of
the instruction operand, e.g. from 32 bits to 16 bits, but it will not
modify the size of the displacement operand used for memory addressing,
which will always be 32 bits.
Reviewed By: skan, rafauler
Differential Revision: https://reviews.llvm.org/D126726
2022-06-13 00:14:43 -07:00
Kazu Hirata
92ab024f81
[X86] Use default member initialization (NFC)
...
Identified with modernize-use-default-member-init.
2022-06-12 18:30:46 -07:00
Jez Ng
d4bcb45db7
[MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible
...
This reverts commit d941d59783 .
Differential Revision: https://reviews.llvm.org/D122258
2022-06-12 17:24:19 -04:00
Mehdi Amini
5d8298a768
Revert "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"
...
This reverts commit 2d2da259c8 .
This breaks MLIR integration test (JIT crashing), reverting in the
meantime.
2022-06-12 15:14:37 +00:00
Jez Ng
d941d59783
Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible"
...
This reverts commit ef501bf85d .
2022-06-12 10:47:08 -04:00
Simon Pilgrim
b5d7beeb97
[X86] combineConcatVectorOps - add support for concatenation of VSELECT/BLENDV nodes (REAPPLIED)
...
If the LHS/RHS selection operands can be cheaply concatenated back together then replace 2 x 128-bit selection nodes with 1 x 256-bit node
Addresses the regression introduced in the bug fix from rGd5af6a38082b39ae520a328e44dc29ebcb036bb2
REAPPLIED with for bug identified in rGea8fb3b60196
2022-06-12 15:40:36 +01:00
Jez Ng
ef501bf85d
[MC] Omit DWARF unwind info if compact unwind is present where eligible
...
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
2022-06-12 10:03:56 -04:00
Phoebe Wang
2d2da259c8
[X86][RFC] Enable `_Float16` type support on X86 following the psABI
...
GCC and Clang/LLVM will support `_Float16` on X86 in C/C++, following
the latest X86 psABI. (https://gitlab.com/x86-psABIs )
_Float16 arithmetic will be performed using native half-precision. If
native arithmetic instructions are not available, it will be performed
at a higher precision (currently always float) and then truncated down
to _Float16 immediately after each single arithmetic operation.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D107082
2022-06-12 11:40:00 +08:00
Simon Pilgrim
7841d09449
[X86][AVX512] Retain pmuldq broadcast loads on 32-bit targets
...
Don't demand just the lower 32-bits on 32-bit AVX512 targets to preserve 64-bit broadcast loads patterns
2022-06-11 19:30:00 +01:00
Simon Pilgrim
6eaea225c7
[X86] combineTargetShuffle - break if-else chain. NFC.
...
(style) Both cases always continue.
2022-06-11 09:16:39 +01:00
Simon Pilgrim
89d2b1e4f7
[X86] emitOrXorXorTree - break if-else chain. NFC.
...
(style) Both cases always return.
2022-06-11 09:16:38 +01:00
Fangrui Song
adf4142f76
[MC] De-capitalize SwitchSection. NFC
...
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Eli Friedman
0ff51d5dde
Fix interaction of CFI instructions with MachineOutliner.
...
1. When checking if a candidate contains a CFI instruction, actually
iterate over all of the instructions, instead of stopping halfway
through.
2. Make sure copied CFI directives refer to the correct instruction.
Fixes https://github.com/llvm/llvm-project/issues/55842
Differential Revision: https://reviews.llvm.org/D126930
2022-06-10 13:37:49 -07:00
Guillaume Chatelet
38637ee477
[clang] Add support for __builtin_memset_inline
...
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.
The idea is to get support from the compiler to easily write efficient memory function implementations.
This patch could be split in two:
- one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
- and another one for the Clang part providing the instrinsic as a builtin.
Differential Revision: https://reviews.llvm.org/D126903
2022-06-10 13:13:59 +00:00
Simon Pilgrim
5acbb2dda2
[X86] combineMulToPMADDWD - don't bitcast the source ops before splitting to ensure we split the build vectors early
...
Fixes a regression on D127115 - splitting was creating extract_subvector(bitcast(build_vector())) patterns which prevented the build vectors being split before being bitcast to vXi16 types, resulting in various issues with further folding of the (now legal) build vectors
2022-06-10 13:44:49 +01:00
Simon Pilgrim
7ac33b8aac
[X86] Remove !VT.is128BitVector() check. NFCI.
...
The code is inside a if(VT.is256BitVector() || VT.is512BitVector()) condition
2022-06-09 21:39:45 +01:00
Simon Pilgrim
72a049d778
[X86][AVX2] LowerINSERT_VECTOR_ELT - support v4i64 insertion as BLENDI(X, SCALAR_TO_VECTOR(Y))
2022-06-09 21:18:10 +01:00
Simon Pilgrim
1a02db9882
[X86] canonicalizeShuffleWithBinOps - add TODO for X86ISD::ANDNP bitwise handling
...
Its just as safe to move shuffles across X86ISD::ANDNP as any other logical bitop, they just tend to appear too late to matter.
Noticed while triaging D127115 regressions.
2022-06-09 12:18:26 +01:00
Guillaume Chatelet
dc3367970e
[SelectionDAG] Handle bzero/memset libcalls globally instead of per target
...
Differential Revision: https://reviews.llvm.org/D127279
2022-06-09 08:34:55 +00:00
Simon Pilgrim
9a76337fee
[X86] combineMOVMSK - constant fold with getTargetConstantBitsFromNode not just BUILD_VECTOR
...
Help avoid a regression in D127115
2022-06-08 17:48:55 +01:00
Matt Arsenault
cc5a1b3dd9
llvm-reduce: Add cloning of target MachineFunctionInfo
...
MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.
Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.
2022-06-07 10:14:48 -04:00
Guillaume Chatelet
0788186182
[Alignment][NFC] Remove usage of MemSDNode::getAlignment
...
I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.
Differential Revision: https://reviews.llvm.org/D126910
2022-06-07 13:52:20 +00:00
Simon Pilgrim
f5507978a3
[X86] getFauxShuffleMask - add VSELECT/BLENDV handling
...
First step towards enabling shuffle combining starting from VSELECT/BLENDV nodes - this should eventually help improve the codegen reported at Issue #54819
2022-06-07 14:46:25 +01:00
Simon Pilgrim
5cea1553b8
[X86] X86SpeculativeLoadHardening.cpp - pass DebugLoc by const reference not value.
2022-06-07 12:38:05 +01:00
Simon Pilgrim
1b6d3bdc82
[X86] foldMaskedMergeImpl - pass SDLoc by const reference not value.
2022-06-07 12:36:30 +01:00
Simon Pilgrim
63e3035dbe
[X86] LowerGC_TRANSITION - remove redundant SDLoc().
2022-06-07 10:57:58 +01:00
Fangrui Song
15d82c62dc
[MC] De-capitalize MCStreamer functions
...
Follow-up to c031378ce0 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
Shilei Tian
0c3e6e5717
[NFC] Remove trailing whitespace
2022-06-06 18:59:13 -04:00
Fangrui Song
77e300ffdf
[MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"
2022-06-05 15:11:01 -07:00
Kazu Hirata
9a8e65de8c
[Target] Use MachineBasicBlock::erase (NFC)
2022-06-04 22:41:24 -07:00