Based off the script from D103695, on AVX1, Jaguar/Bulldozer both have low throughput for ymm select patterns (BLENDV + OR(AND,ANDN))), and even on AVX2 Haswell still struggles with BLENDV ops
Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput
Based off the script from D103695, we were exaggerating the cost of the v2i64 comparison expansion using instruction count instead of effective throughput
Split off ABS cost handling from MIN/MAX and use explicit predicates for each
Our generic expansion of ABS doesn't use NEG+CMP+SELECT any more (its now ASHR+ADD+XOR) so this needs to be updated.
As suggested on D111024, we should treat getCmpSelInstrCost calls without a specific predicate as matching the worst case predicate cost.
These regressions will be addressed with a mixture of D111024 and fixing other specific getCmpSelInstrCost calls to have realistic predicates.